SHARP: Single-cell RNA-seq Hyper-fast and Accurate Processing via Ensemble Random Projection

Authors: Shibiao Wan, Junil Kim, Kyoung Jae Won

DOI: 10.1101/461640

Abstract: To process large-scale single-cell RNA-sequencing (scRNA-seq) data effectively without excessive distortion during dimension reduction, we present SHARP, an ensemble random projection-based algorithm that scales to clustering 10 million cells. Comprehensive benchmarking tests on 17 public scRNA-seq datasets demonstrate that SHARP outperforms existing methods in terms of speed and accuracy. In particular, for large datasets (>40,000 cells), SHARP's running speed far exceeds that of its competitors while maintaining high accuracy and robustness. To the best of our knowledge, SHARP is the only R-based tool with
