Authors: Alan P Boyle, Nanxiang Zhao
Keywords:
Abstract: Genomic and epigenomic features are captured at a genome-wide level by using high-throughput sequencing (HTS) technologies. Peak calling delineates features identified in HTS experiments, such as open chromatin regions and transcription factor binding sites, by comparing the observed read distributions to a random expectation. Since its introduction, F-Seq has been widely used and shown to be the most sensitive and accurate peak caller for DNase I hypersensitive site (DNase-seq) data. However, the first release (F-Seq1) has two key limitations: lack of support for user-input control datasets, and poor test statistic reporting. These constrain its ability to capture systematic and experimental biases inherent to the background distributions in peak prediction, and to subsequently rank predicted peaks by confidence. To address these limitations, we present F-Seq2, which combines kernel density estimation and a dynamic 'continuous' Poisson test to account for local biases and accurately rank candidate peaks. The output of F-Seq2 is suitable for irreproducible discovery rate analysis, as test statistics are calculated for individual candidate summits, allowing direct comparison of predictions across replicates. These improvements significantly boost the performance of F-Seq2 for ATAC-seq and ChIP-seq datasets, outperforming competing peak callers used by the ENCODE Consortium in terms of precision and recall.
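
To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch of a Gaussian kernel density estimate over read positions and a Poisson tail probability extended to non-integer signal values. It is an illustration under stated assumptions, not F-Seq2's actual implementation: the function names, the `window` scaling, and the rule of taking the larger of the local control signal and the average control rate as the background are all hypothetical. The continuous extension uses the identity P(X >= k) = P(k, lambda), where P is the regularized lower incomplete gamma function, which is defined for real k > 0.

```python
import math


def gaussian_kde_signal(read_positions, grid, bandwidth):
    """Sum a Gaussian kernel centred on each read, evaluated at each grid point.

    The result is an unnormalised density in reads per base pair.
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    signal = []
    for x in grid:
        total = 0.0
        for r in read_positions:
            z = (x - r) / bandwidth
            total += math.exp(-0.5 * z * z)
        signal.append(norm * total)
    return signal


def continuous_poisson_sf(x, lam, tol=1e-12, max_terms=10000):
    """Upper tail P(X >= x) of a Poisson(lam), extended to real x > 0.

    Uses the series for the regularized lower incomplete gamma P(x, lam).
    Intended for modest lam; very large lam would overflow the series sum.
    """
    if x <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= lam / (x + n)
        total += term
        if term < tol * total:
            break
    log_prefactor = x * math.log(lam) - lam - math.lgamma(x + 1.0)
    return min(1.0, math.exp(log_prefactor) * total)


def score_summit(treatment, control, grid, bandwidth, window, region_length):
    """Locate the highest treatment signal and test it against a local background.

    Hypothetical scoring rule: the background rate is the larger of the
    control signal at the summit and the average control rate, and both
    signals are scaled by `window` to an expected-count scale.
    """
    t_sig = gaussian_kde_signal(treatment, grid, bandwidth)
    c_sig = gaussian_kde_signal(control, grid, bandwidth)
    i = max(range(len(t_sig)), key=t_sig.__getitem__)
    lam = max(c_sig[i], len(control) / region_length) * window
    return grid[i], continuous_poisson_sf(t_sig[i] * window, lam)
```

For example, `score_summit([480, 495, 500, 503, 510, 520, 900], [100, 300, 500, 700, 900], list(range(0, 1001, 10)), 30.0, 200, 1000)` returns the grid position of the strongest treatment signal together with a tail probability for that summit. Because every candidate summit receives its own statistic, summits can be ranked and compared across replicates, which is the property the abstract highlights.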