Learning from Decoys to Improve the Sensitivity and Specificity of Proteomics Database Search Results

Authors: Amit Kumar Yadav, Dhirendra Kumar, Debasis Dash

DOI: 10.1371/JOURNAL.PONE.0050651

Abstract: The statistical validation of database search results is a complex issue in bottom-up proteomics. The correct and incorrect peptide spectrum match (PSM) scores overlap significantly, making an accurate assessment of true peptide matches challenging. Since the complete separation between the true and false hits is practically never achieved, there is a need for better methods and rescoring algorithms to improve upon the primary database search results. Here we describe the calibration and False Discovery Rate (FDR) estimation of database search scores through a dynamic FDR calculation method, FlexiFDR, which increases both the sensitivity and specificity of search results. Modelling a simple linear regression on the decoy hits for different charge states, the method maximized the number of true positives and reduced the number of false negatives in several standard datasets of varying complexity (18-mix, 49-mix, 200-mix) and a few complex datasets (E. coli and yeast) obtained from a wide variety of MS platforms. The net positive gain for correct spectral and peptide identifications was up to 14.81% and 6.2%, respectively. The approach is applicable to different search methodologies: separate as well as concatenated database search, high mass accuracy, and semi-tryptic and modification searches. FlexiFDR was also applied to Mascot results and showed better performance than before. We have shown that an appropriate threshold learnt from decoys can be very effective in improving the database search results. FlexiFDR adapts itself to different instruments, data types and MS platforms. It learns from the decoy hits and sets a flexible threshold that automatically aligns itself to the underlying variables of data quality and size.
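
For readers unfamiliar with decoy-based validation, the following is a minimal sketch of the generic target-decoy FDR estimate for a separate search, assuming each PSM is a `(score, is_decoy)` pair with higher scores being better. It is not the FlexiFDR algorithm: the abstract does not specify how the per-charge-state regression is built, so that step is left out. The function names `decoy_fdr` and `lowest_threshold` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Assumed representation: (search score, True if the hit is a decoy).
PSM = Tuple[float, bool]


def decoy_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR at a score threshold as decoy hits / target hits."""
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No target hits pass: treat any passing decoy as total failure.
        return 1.0 if decoys else 0.0
    return decoys / targets


def lowest_threshold(psms: List[PSM], max_fdr: float = 0.01) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr.

    Returns None when no observed score satisfies the limit.
    """
    best = None
    # Walk candidate thresholds from the strictest to the most permissive.
    for candidate in sorted({score for score, _ in psms}, reverse=True):
        if decoy_fdr(psms, candidate) <= max_fdr:
            best = candidate
    return best
```

A fixed cut-off like this applies one threshold to every spectrum. The abstract's point is that a threshold learnt from the decoy hits of each charge state can adapt to data quality and size instead.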
