Prediction of lysine propionylation sites using biased SVM and incorporating four different sequence features into Chou's PseAAC.

Authors: Zhe Ju, Jian-Jun He

DOI: 10.1016/J.JMGM.2017.07.022

Abstract: Lysine propionylation is an important and common protein acylation modification in both prokaryotes and eukaryotes. To better understand the molecular mechanism of propionylation, it is important to identify propionylated substrates and their corresponding propionylation sites accurately. In this study, a novel bioinformatics tool named PropPred is developed to predict propionylation sites by using multiple feature extraction and a biased support vector machine. On the one hand, various features are incorporated, including amino acid composition, amino acid factors, binary encoding, and the composition of k-spaced amino acid pairs, and the F-score method and an incremental feature selection algorithm are adopted to remove redundant features. On the other hand, the biased support vector machine is used to handle the imbalance problem in the training dataset. As illustrated by 10-fold cross-validation, PropPred achieves a satisfactory performance with a sensitivity of 70.03%, a specificity of 75.61%, an accuracy of 75.02%, and a Matthews correlation coefficient of 0.3085. Feature analysis shows that some amino acid factors play the most important roles in the prediction of propionylation sites. These results might provide clues for understanding the molecular mechanisms of propionylation. A user-friendly web-server for PropPred is established at 123.206.31.171/PropPred/.
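The abstract names several computational steps without giving formulas. As a rough illustration only, the sketch below shows one common way to compute the composition of k-spaced amino acid pairs, a per-feature F-score, and the four reported evaluation measures. The function names, the 20-letter alphabet ordering, and the particular F-score definition (the widely used two-class form with sample variances) are assumptions made for this example. They are not taken from PropPred itself, whose exact window size, k values, and formulas are not stated in the abstract.

```python
from collections import Counter
from math import sqrt

# Assumed ordering of the 20 standard residues (hypothetical choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(peptide, k):
    """Composition of k-spaced amino acid pairs for one peptide window.

    Returns 400 frequencies, one per ordered residue pair (a, b), where
    a pair is counted at position i if peptide[i] == a and
    peptide[i + k + 1] == b. Pairs containing non-standard symbols
    (for example padding characters) add to the total but to no feature.
    """
    n_pairs = max(len(peptide) - k - 1, 0)
    counts = Counter(peptide[i] + peptide[i + k + 1] for i in range(n_pairs))
    total = max(n_pairs, 1)  # avoid division by zero for short peptides
    return [counts[a + b] / total for a in AMINO_ACIDS for b in AMINO_ACIDS]


def f_score(pos_values, neg_values):
    """F-score of a single feature over positive and negative samples.

    Assumed definition: squared distances of the class means from the
    overall mean, divided by the sum of the two sample variances.
    Each class needs at least two values.
    """
    all_values = list(pos_values) + list(neg_values)
    mean_all = sum(all_values) / len(all_values)
    mean_pos = sum(pos_values) / len(pos_values)
    mean_neg = sum(neg_values) / len(neg_values)
    var_pos = sum((x - mean_pos) ** 2 for x in pos_values) / (len(pos_values) - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg_values) / (len(neg_values) - 1)
    denom = var_pos + var_neg
    if denom == 0:
        return 0.0  # constant feature in both classes: no discriminative power
    return ((mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2) / denom


def evaluation_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc


if __name__ == "__main__":
    features = cksaap("AKA", k=1)           # only pair: 'A' ... 'A' with gap 1
    print(len(features), features[0])        # 400 features, frequency of 'AA'
    print(f_score([1.0, 3.0], [5.0, 7.0]))   # well-separated toy feature
    print(evaluation_metrics(tp=50, tn=50, fp=0, fn=0))
```

In an incremental feature selection scheme, features would typically be ranked by such a score and added one at a time while cross-validated performance is tracked. That loop is omitted here because the abstract gives no details of it.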
