Classification and Analysis of Regulatory Pathways Using Graph Property, Biochemical and Physicochemical Property, and Functional Property

Authors: Tao Huang, Lei Chen, Yu-Dong Cai, Kuo-Chen Chou

DOI: 10.1371/JOURNAL.PONE.0025297

Keywords: Pseudo amino acid composition; Computational biology; Graph property; Systems biology; Information processing; Feature (machine learning); Bioinformatics; Regulatory pathway; Property (philosophy); Redundancy (engineering); Biology

Abstract: Given a regulatory pathway system consisting of a set of proteins, can we predict which pathway class it belongs to? Such a problem is closely related to the biological function of the pathway in cells and hence is quite fundamental and essential in systems biology and proteomics. It is also an extremely difficult and challenging problem due to its complexity. To address this problem, a novel approach was developed that can be used to predict query pathways among the following six functional categories: (i) “Metabolism”, (ii) “Genetic Information Processing”, (iii) “Environmental Information Processing”, (iv) “Cellular Processes”, (v) “Organismal Systems”, and (vi) “Human Diseases”. The prediction method was established through the following procedures: (i) according to the general form of pseudo amino acid composition (PseAAC), each of the pathways concerned is formulated as a 5570-D (dimensional) vector; (ii) each of the components in the 5570-D vector was derived by a series of feature extractions from the pathway system according to its graphic property, biochemical and physicochemical property, as well as functional property; (iii) the minimum redundancy maximum relevance (mRMR) method was adopted to operate the prediction. A cross-validation by the jackknife test on a benchmark dataset consisting of 146 regulatory pathways indicated that an overall success rate of 78.8% was achieved by our method in identifying query pathways among the above six classes, indicating that the outcome is quite promising and encouraging. To the best of our knowledge, the current study represents the first effort in attempting to identify the type of a pathway system or its function. It is anticipated that our report may stimulate a series of follow-up investigations in this new and challenging area.
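
The jackknife test named in the abstract is leave-one-out cross-validation: each pathway is held out in turn, a classifier is built from the remaining ones, and the held-out pathway is predicted. The sketch below illustrates that procedure only. The nearest-neighbor classifier, the two-dimensional toy vectors, and the function names are assumptions made for illustration; the abstract does not state which classifier was used, and the real feature vectors are 5570-dimensional.

```python
import math


def nearest_neighbor_label(query, vectors, labels):
    """Return the label of the vector closest to `query` (Euclidean distance).

    A 1-nearest-neighbor rule is an assumption for this sketch; the abstract
    does not name the classifier.
    """
    best_index = min(range(len(vectors)),
                     key=lambda i: math.dist(query, vectors[i]))
    return labels[best_index]


def jackknife_success_rate(vectors, labels):
    """Leave-one-out (jackknife) success rate.

    Each sample is held out once, predicted from all remaining samples,
    and counted as correct if the predicted label matches its true label.
    """
    correct = 0
    for i in range(len(vectors)):
        train_vectors = vectors[:i] + vectors[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        predicted = nearest_neighbor_label(vectors[i], train_vectors, train_labels)
        if predicted == labels[i]:
            correct += 1
    return correct / len(vectors)


# Hypothetical toy data: six "pathways" as 2-D vectors in two of the six classes.
toy_vectors = [
    [0.0, 0.1], [0.1, 0.0], [0.2, 0.1],   # labeled "Metabolism"
    [1.0, 1.1], [1.1, 1.0], [0.9, 1.0],   # labeled "Human Diseases"
]
toy_labels = ["Metabolism"] * 3 + ["Human Diseases"] * 3

if __name__ == "__main__":
    rate = jackknife_success_rate(toy_vectors, toy_labels)
    print(f"Jackknife success rate on toy data: {rate:.1%}")
```

For scale, a success rate of 78.8% on 146 pathways is consistent with 115 correct predictions (115/146 ≈ 0.788). This count is inferred from the two reported figures and is not stated in the abstract.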
