Computing P-values for peptide identifications in mass spectrometry

Authors: Tema Fridman, Robert M. Day, Andrey A. Gorin, Nikita Arnold

Abstract: Mass-spectrometry (MS) is a powerful experimental technology for "sequencing" proteins in complex biological mixtures. Computational methods are essential for the interpretation of MS data, and a number of theoretical questions remain unresolved due to the intrinsic complexity of the related algorithms. Here we design an analytical approach to estimate the confidence values of peptide identification in so-called database search methods. The approach explores the properties of mass tags: sequences of masses (m1 m2 ... mn), where the individual masses are distances between spectral lines. We define the p-function, the probability of finding a random match for any given tag in a protein database, and verify the concept with extensive experiments. We then discuss its properties, its applications to finding highly reliable matches in experiments, and the possibility of analytically evaluating properties of the SEQUEST X-correlation function.
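
The idea of a random-match probability for a mass tag can be illustrated with a minimal Python sketch. This is not the authors' p-function, whose definition is not given in the abstract. It assumes, purely for illustration, that each tag mass corresponds to a single amino acid residue, that residues occur independently with fixed frequencies, and that database positions are independent. The function names, the uniform default frequencies, and the 0.02 Da tolerance are all hypothetical choices.

```python
import math

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def step_probability(mass, freqs, tol):
    """Chance that one random residue has a mass within tol of `mass`."""
    return sum(f for aa, f in freqs.items()
               if abs(RESIDUE_MASS[aa] - mass) <= tol)


def random_match_probability(tag, n_positions, freqs=None, tol=0.02):
    """Probability of at least one random tag match among n_positions.

    Assumption: residues and positions are independent, so this is only
    a rough approximation of a real protein database.
    """
    if freqs is None:
        # Hypothetical default: all 20 residues equally likely.
        freqs = {aa: 1.0 / len(RESIDUE_MASS) for aa in RESIDUE_MASS}
    # Probability that the tag matches at one fixed start position.
    q = math.prod(step_probability(m, freqs, tol) for m in tag)
    return 1.0 - (1.0 - q) ** n_positions


def count_matches(sequence, tag, tol=0.02):
    """Count start positions where consecutive residue masses fit the tag."""
    n = len(tag)
    return sum(
        all(abs(RESIDUE_MASS[sequence[i + j]] - tag[j]) <= tol
            for j in range(n))
        for i in range(len(sequence) - n + 1)
    )
```

Comparing `count_matches` on randomly generated sequences against `random_match_probability` is one simple way to check such an estimate empirically. Note that leucine and isoleucine have the same mass, so a tag step at 113.08406 Da is twice as likely to match by chance as a step at a unique residue mass.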
