An Ensemble Classifier with Random Projection for Predicting Protein-Protein Interactions Using Sequence and Evolutionary Information

Authors: Xiao-Yu Song, Zhan-Heng Chen, Xiang-Yang Sun, Zhu-Hong You, Li-Ping Li

DOI: 10.3390/APP8010089

Keywords: Singular value decomposition; Pattern recognition; Support vector machine; Random projection; Classifier (UML); Protein–protein interaction; Discrete cosine transform; Fast Fourier transform; Computer science; Artificial intelligence; Word error rate

Abstract: Identifying protein-protein interactions (PPIs) is crucial for understanding various biological processes in cells. Although high-throughput techniques generate a large amount of PPI data across species, the data cover only a small fraction of the entire network. Furthermore, these approaches are costly, time-consuming, and have a high error rate. Therefore, it is necessary to design computational methods for efficiently detecting PPIs. In this study, a random projection ensemble classifier (RPEC) was explored to identify novel PPIs using the evolutionary information contained in protein amino acid sequences. The evolutionary information was obtained from a position-specific scoring matrix (PSSM) generated by PSI-BLAST. A feature fusion scheme was then developed by combining the discrete cosine transform (DCT), the fast Fourier transform (FFT), and singular value decomposition (SVD). Finally, via the random projection ensemble classifier, the performance of the presented approach was evaluated on the Yeast, Human, and H. pylori datasets using 5-fold cross-validation. Our approach achieved prediction accuracies of 95.64%, 96.59%, and 87.62%, respectively, effectively outperforming other existing methods. Generally speaking, our approach is quite promising and supplies a practical and effective method for predicting
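The pipeline named in the abstract (DCT, FFT, and SVD descriptors of a PSSM, followed by an ensemble of classifiers trained on random projections) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function and class names, the 20 x 20 low-frequency block, the ensemble size, the projection dimension, and the nearest-centroid base learner are all hypothetical choices made here to keep the example NumPy-only. The abstract does not state which base classifier RPEC uses, how many coefficients are kept, or how the two proteins of a pair are combined into one feature vector.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def low_block(a, size):
    """Top-left size x size block, zero-padded if the input is smaller."""
    out = np.zeros((size, size))
    r, c = min(size, a.shape[0]), min(size, a.shape[1])
    out[:r, :c] = a[:r, :c]
    return out

def fused_features(pssm, size=20):
    """Fixed-length vector from an L x 20 PSSM (block size is an assumption)."""
    pssm = np.asarray(pssm, dtype=float)
    n_rows, n_cols = pssm.shape
    # 2-D DCT: 1-D transform applied along rows, then along columns.
    dct = dct_matrix(n_rows) @ pssm @ dct_matrix(n_cols).T
    # Magnitude spectrum of the 2-D FFT.
    fft_mag = np.abs(np.fft.fft2(pssm))
    # Singular values, zero-padded to a fixed length.
    sv = np.linalg.svd(pssm, compute_uv=False)
    sv_fixed = np.zeros(size)
    sv_fixed[:min(size, sv.size)] = sv[:size]
    return np.concatenate([low_block(dct, size).ravel(),
                           low_block(fft_mag, size).ravel(),
                           sv_fixed])

class RandomProjectionEnsemble:
    """Majority vote over base learners, each fit on its own random projection.

    The nearest-centroid base learner is a stand-in chosen for brevity.
    """

    def __init__(self, n_members=15, n_dims=50, seed=0):
        self.n_members = n_members
        self.n_dims = n_dims
        self.seed = seed

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        rng = np.random.default_rng(self.seed)
        self.classes_ = np.unique(y)
        self.members_ = []
        for _ in range(self.n_members):
            # Gaussian random projection from d features to n_dims features.
            R = rng.normal(size=(X.shape[1], self.n_dims)) / np.sqrt(self.n_dims)
            Z = X @ R
            centroids = np.stack([Z[y == c].mean(axis=0) for c in self.classes_])
            self.members_.append((R, centroids))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        votes = np.zeros((X.shape[0], self.classes_.size), dtype=int)
        for R, centroids in self.members_:
            Z = X @ R
            # Squared distance from every sample to every class centroid.
            d = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            votes[np.arange(X.shape[0]), d.argmin(axis=1)] += 1
        return self.classes_[votes.argmax(axis=1)]
```

In use, one would call `fused_features` on each protein's PSSM, build one vector per protein pair by some combination rule (not specified here), and pass the resulting matrix and interaction labels to `RandomProjectionEnsemble.fit`. Each member sees a different low-dimensional view of the same data, which is the source of diversity in a random projection ensemble.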
