Biologically, the function of a protein is closely related to its subcellular location. A reliable method for predicting protein subcellular location is therefore needed, especially when a large number of proteins are to be analyzed. Various methods have been proposed for this task, but their results are not satisfactory in terms of effectiveness and efficiency. A hybrid approach combining a naïve Bayesian classifier and a k-nearest neighbor classifier is proposed to classify eukaryotic proteins represented as a combination of amino acid composition, dipeptide composition, and functional domain composition. Experimental results show that the total accuracy on a set of 17,655 proteins can reach up to 91.5%.
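To make the sequence-based part of the representation concrete, the following is a minimal sketch of how amino acid composition (20 values) and dipeptide composition (400 values) are commonly computed from a protein sequence. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, residues outside the 20 standard amino acids are simply skipped, and functional domain composition is omitted because it requires an external domain database.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard amino acid.

    Assumption: non-standard residues (e.g. 'X') are ignored, and
    fractions are normalized over the standard residues that were counted.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent residues.

    Assumption: any adjacent pair containing a non-standard residue is skipped.
    """
    seq = seq.upper()
    pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    index = {pair: i for i, pair in enumerate(pairs)}
    counts = [0] * len(pairs)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in index:
            counts[index[pair]] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def sequence_features(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

A vector produced by the hypothetical `sequence_features` could then be passed to any classifier; how the naïve Bayesian and k-nearest neighbor classifiers are combined is not specified here, so that step is left out of the sketch.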
Journal of Information Science and Engineering 24(5):1361-1375