F-Seq2: improving the feature density based peak caller with dynamic statistics.

Authors: Alan P Boyle, Nanxiang Zhao

DOI: 10.1093/NARGAB/LQAB012

Keywords:

Abstract: Genomic and epigenomic features are captured at a genome-wide level by using high-throughput sequencing (HTS) technologies. Peak calling delineates features identified in HTS experiments, such as open chromatin regions and transcription factor binding sites, by comparing the observed read distributions to a random expectation. Since its introduction, F-Seq has been widely used and shown to be the most sensitive and accurate peak caller for DNase I hypersensitive site (DNase-seq) data. However, the first release (F-Seq1) has two key limitations: lack of support for user-input control datasets, and poor test statistic reporting. These constrain its ability to capture systematic and experimental biases inherent to the background distributions in peak prediction, and to subsequently rank predicted peaks by confidence. To address these limitations, we present F-Seq2, which combines kernel density estimation and a dynamic 'continuous' Poisson test to account for local biases and accurately rank candidate peaks. The output of F-Seq2 is suitable for irreproducible discovery rate analysis, as test statistics are calculated for individual candidate summits, allowing direct comparison of predictions across replicates. These improvements significantly boost the performance of F-Seq2 for ATAC-seq and ChIP-seq datasets, outperforming competing peak callers used by the ENCODE Consortium in terms of precision and recall.
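The general idea of pairing a kernel density estimate with a locally adjusted Poisson test can be illustrated with a minimal sketch. This is not the F-Seq2 implementation: the function names, the bandwidth and window defaults, and the rule of taking the larger of a local and a genome-wide control rate are all assumptions made for illustration, and the test below uses ordinary integer read counts rather than the 'continuous' Poisson test described in the abstract.

```python
import math


def gaussian_kernel(bandwidth):
    """Discrete Gaussian weights, truncated at 3 bandwidths, summing to 1."""
    half = int(3 * bandwidth)
    weights = [math.exp(-0.5 * (i / bandwidth) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def kde_signal(read_positions, length, bandwidth):
    """Per-base smoothed read density: one kernel centred on each read."""
    kernel = gaussian_kernel(bandwidth)
    half = len(kernel) // 2
    signal = [0.0] * length
    for pos in read_positions:
        for offset, weight in enumerate(kernel):
            i = pos - half + offset
            if 0 <= i < length:
                signal[i] += weight
    return signal


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam) and integer k."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def score_summits(treatment, control, length, bandwidth=5.0, window=50):
    """Find local maxima of the smoothed signal and rank them by p-value.

    The background rate for each summit is the larger of the control count
    in the local window and the genome-wide control rate scaled to that
    window (an illustrative choice, not taken from the paper).
    """
    signal = kde_signal(treatment, length, bandwidth)
    results = []
    for i in range(1, length - 1):
        is_summit = signal[i] > 0 and signal[i] >= signal[i - 1] and signal[i] > signal[i + 1]
        if not is_summit:
            continue
        lo, hi = max(0, i - window), min(length, i + window + 1)
        observed = sum(1 for p in treatment if lo <= p < hi)
        local = sum(1 for p in control if lo <= p < hi)
        genome_wide = len(control) * (hi - lo) / length
        lam = max(local, genome_wide, 1e-9)
        results.append((i, observed, poisson_upper_tail(observed, lam)))
    # Smallest p-value first, so the most confident summits lead the list.
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    treatment = [95, 97, 98, 99, 100, 100, 100, 101, 102, 103, 105, 300]
    control = [20, 150, 280, 400]
    for summit, count, p_value in score_summits(treatment, control, length=500):
        print(summit, count, p_value)
```

Because each summit carries its own p-value, the ranked lists from two replicates can be compared summit by summit, which is the property the abstract highlights for irreproducible discovery rate analysis.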
