We investigate how information about a residue's structural and physicochemical environment, such as protein secondary structure and residue solvent accessibility, can be used to predict protein structural class (all-alpha, all-beta, alpha/beta, and alpha+beta). This information is described by residue environment profiles, which are derived from a relatively small set of 500 protein sequences with sequence identity below 25%. The method achieves an accuracy of 49.2% for four-class prediction of monomeric, non-disulphide-bonded proteins, given that none of the nonclassified protein sequences has a sequence identity higher than 25%. This result is comparable to that of the amino acid composition method, which obtains an accuracy of 48% for a set of sequences with sequence similarity below 30%. The current approach has several advantages: (1) it is a physical approach, (2) it has no adjustable parameters, and (3) it is simple and efficient.
Asian Journal of Health and Information Sciences 1(3):332-342