Authors: Fahad Saeed, Jason D. Hoffert, Mark A. Knepper
Keywords:
Abstract: High-throughput mass spectrometers can produce massive amounts of redundant data at an astonishing rate, and many of the resulting spectra have a poor signal-to-noise (S/N) ratio. These low S/N ratio spectra may not be interpreted by conventional spectra-to-database matching techniques. In this paper, we present an efficient algorithm, CAMS-RS (Clustering Algorithm for Mass Spectra using Restricted Space and Sampling), for clustering raw mass spectrometry data. CAMS-RS utilizes a novel metric (called F-set) that exploits temporal and spatial patterns to accurately assess the similarity between two given spectra. The F-set similarity metric is independent of retention time and allows clustering of mass spectrometry data from independent LC-MS/MS runs. A novel restricted search space strategy is devised to limit the number of spectra comparisons. An intelligent sampling method is executed on individual bins, which allows the results to be merged to make the final clusters. Our experiments, using experimentally generated data sets, show that the proposed algorithm is able to cluster spectra with high accuracy and is helpful in interpreting low S/N ratio spectra. The CAMS-RS algorithm is highly scalable with an increasing number of spectra, and our implementation allows clustering of up to a million spectra within minutes.
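The restricted search space idea can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the F-set metric, the binning rule, or the sampling method, so everything below is an assumption made for illustration: spectra are binned by precursor m/z, a simple shared-peak fraction stands in for F-set, and the names `bin_by_precursor`, `shared_peak_fraction`, `cluster_within_bins`, `bin_width`, `tol`, and `threshold` are hypothetical. The sampling and merging step is not sketched.

```python
from collections import defaultdict
from itertools import combinations


def bin_by_precursor(spectra, bin_width=1.0):
    """Group spectrum ids into bins by precursor m/z (assumed binning rule)."""
    bins = defaultdict(list)
    for spec_id, (precursor_mz, _peaks) in spectra.items():
        bins[int(precursor_mz // bin_width)].append(spec_id)
    return bins


def shared_peak_fraction(peaks_a, peaks_b, tol=0.5):
    """Placeholder similarity, NOT the F-set metric.

    Returns the fraction of peaks in the smaller spectrum that have a
    partner within `tol` in the larger spectrum.
    """
    if not peaks_a or not peaks_b:
        return 0.0
    small, large = sorted((peaks_a, peaks_b), key=len)
    matched = sum(1 for p in small if any(abs(p - q) <= tol for q in large))
    return matched / len(small)


def cluster_within_bins(spectra, bin_width=1.0, threshold=0.6):
    """Cluster spectra with union-find, comparing only pairs in the same bin."""
    parent = {spec_id: spec_id for spec_id in spectra}

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for members in bin_by_precursor(spectra, bin_width).values():
        # The restriction: pairs from different bins are never compared.
        for a, b in combinations(members, 2):
            if shared_peak_fraction(spectra[a][1], spectra[b][1]) >= threshold:
                parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for spec_id in spectra:
        clusters[find(spec_id)].add(spec_id)
    return sorted(sorted(c) for c in clusters.values())


# Toy input: id -> (precursor m/z, list of fragment peak m/z values).
toy_spectra = {
    "s1": (500.2, [100.0, 200.0, 300.0]),
    "s2": (500.4, [100.1, 200.2, 299.9]),
    "s3": (500.6, [150.0, 250.0, 350.0]),
    "s4": (742.1, [100.0, 200.0, 300.0]),
}
toy_clusters = cluster_within_bins(toy_spectra)
```

In this toy input, `s4` has the same peaks as `s1` but lands in a different bin, so the pair is never compared. Binning cuts the number of pairwise comparisons in exactly this way, at the cost of never grouping spectra that fall in different bins.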