Prediction of membrane proteins using split amino acid and ensemble classification.

Authors: Maqsood Hayat, Asifullah Khan, Mohammed Yeasin

DOI: 10.1007/S00726-011-1053-5

Keywords: Support vector machine, Membrane protein, AdaBoost, k-nearest neighbors algorithm, Computer science, Artificial intelligence, Pseudo amino acid composition, Random forest, Pattern recognition, Bioinformatics, Probabilistic neural network, Feature extraction

Abstract: Knowledge of the types of membrane protein provides useful clues in deducing the functions of uncharacterized membrane proteins. An automatic method for efficiently identifying the types of uncharacterized proteins is thus highly desirable. In this work, we have developed a novel method for predicting membrane protein types by exploiting the discrimination capability of the difference in amino acid composition at the N and C terminus through split amino acid composition (SAAC). We also show that ensemble classification can better exploit this discriminating capability of SAAC. In this study, membrane protein types are classified using three feature extraction and several classification strategies. An ensemble classifier, Mem-EnsSAAC, is then developed using the best feature extraction strategy. Pseudo amino acid (PseAA) composition, discrete wavelet analysis (DWT), SAAC, and a hybrid model are employed for feature extraction. The nearest neighbor, probabilistic neural network, support vector machine, random forest, and AdaBoost are used as individual classifiers. The predicted results of the individual learners are combined using a genetic algorithm to form an ensemble classifier, Mem-EnsSAAC, yielding an accuracy of 92.4% and 92.2% for the jackknife and independent dataset tests, respectively. Performance measures such as MCC, sensitivity, specificity, F-measure, and Q-statistics show that SAAC-based prediction yields significantly higher performance compared to PseAA- and DWT-based systems, and is also the best reported so far. The proposed Mem-EnsSAAC is able to predict membrane protein types with high accuracy and, consequently, can be very helpful in drug discovery. It can be accessed at http://111.68.99.218/membrane.
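The abstract describes SAAC only at a high level: amino acid composition is computed separately for the N-terminal and C-terminal regions of a sequence rather than once over the whole chain. The following is a minimal sketch of that idea, not the authors' implementation. The segment lengths (`n_len`, `c_len`, defaulting to 25 residues), the inclusion of a middle segment, the function names, and the resulting 60-dimensional layout are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines feature positions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the 20-dimensional relative frequency vector of a segment.

    Non-standard residue codes (e.g. 'X') are ignored. An empty segment
    yields a vector of zeros.
    """
    counts = Counter(segment)
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def saac(sequence, n_len=25, c_len=25):
    """Split amino acid composition (illustrative sketch).

    The sequence is cut into an N-terminal segment, a middle segment, and a
    C-terminal segment; the composition of each is concatenated.
    n_len and c_len are assumed values, not taken from the abstract.
    """
    seq = sequence.upper()
    if len(seq) < n_len + c_len:
        raise ValueError("sequence is shorter than the two terminal segments")
    split_at = len(seq) - c_len
    n_part = seq[:n_len]
    middle = seq[n_len:split_at]
    c_part = seq[split_at:]
    return composition(n_part) + composition(middle) + composition(c_part)


if __name__ == "__main__":
    # Hypothetical toy sequence: 25 x A, 10 x C, 25 x D.
    toy = "A" * 25 + "C" * 10 + "D" * 25
    features = saac(toy)
    print(len(features))
```

A vector of this kind could then be passed to any of the individual classifiers named in the abstract. How their outputs are weighted by the genetic algorithm is not described there, so it is not sketched here.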
