CAMS-RS: clustering algorithm for large-scale mass spectrometry data using restricted search space and intelligent random sampling

Authors: Fahad Saeed, Jason D. Hoffert, Mark A. Knepper

DOI: 10.1109/TCBB.2013.152

Abstract: High-throughput mass spectrometers can produce massive amounts of redundant data at an astonishing rate, with many of the spectra having a poor signal-to-noise (S/N) ratio. These low S/N ratio spectra may not get interpreted using conventional spectra-to-database matching techniques. In this paper, we present an efficient algorithm, CAMS-RS (Clustering Algorithm for Mass Spectra using Restricted Space and Sampling), for clustering raw mass spectrometry data. CAMS-RS utilizes a novel metric (called F-set) that exploits the temporal and spatial patterns to accurately assess similarity between two given spectra. The F-set similarity metric is independent of the retention time and allows clustering of mass spectrometry data from independent LC-MS/MS runs. A novel restricted search space strategy is devised to limit the number of comparisons between spectra. An intelligent sampling method is executed on individual bins, which allows the results to be merged to make the final clusters. Our experiments, using experimentally generated data sets, show that the proposed algorithm is able to cluster spectra with high accuracy and is helpful in interpreting low S/N ratio spectra. The CAMS-RS algorithm is highly scalable with an increasing number of spectra, and our implementation allows clustering of up to a million spectra within minutes.
