Authors: Chao Deng, Timothy Daley, Andrew Smith
DOI: 10.1007/S40484-015-0049-7
Keywords:
Abstract: The species accumulation curve, or collector’s curve, of a population gives the expected number of observed species or distinct classes as a function of sampling effort. Species accumulation curves allow researchers to assess and compare diversity across populations or to evaluate the benefits of additional sampling. Traditional applications have focused on ecological populations, but emerging large-scale applications, for example in DNA sequencing, are orders of magnitude larger and present new challenges. We developed a method to estimate accumulation curves for predicting the complexity of DNA sequencing libraries. This method uses rational function approximations to a classical nonparametric empirical Bayes estimator due to Good and Toulmin [Biometrika, 1956, 43, 45–63]. Here we demonstrate how the same approach can be highly effective in other large-scale applications involving biological data sets. These include estimating microbial species richness, immune repertoire size, and k-mer diversity for genome assembly applications. We show how the method can be modified to address populations containing an effectively infinite number of species, where saturation cannot practically be attained. We also introduce a flexible suite of tools implemented as an R package that make these methods broadly accessible.
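
As a rough illustration of the classical Good and Toulmin estimator mentioned in the abstract, the following Python sketch computes the expected number of new classes when sampling effort is increased by a relative amount t, using the alternating series sum over j of (-1)^(j+1) t^j n_j, where n_j is the number of classes observed exactly j times. The function names `counts_histogram` and `good_toulmin` are hypothetical and are not taken from the authors' R package. The sketch covers only the plain series, which is reliable for t up to 1; it does not implement the rational function approximations the authors use to extrapolate further.

```python
from collections import Counter


def counts_histogram(observations):
    """Return {j: n_j}, where n_j is the number of classes seen exactly j times.

    Hypothetical helper for illustration only.
    """
    per_class = Counter(observations)          # class -> times observed
    return dict(Counter(per_class.values()))   # j -> number of classes seen j times


def good_toulmin(hist, t):
    """Expected number of new classes after t times more sampling effort.

    Evaluates the alternating power series sum_j (-1)^(j+1) * t^j * n_j.
    The raw series is only numerically reliable for 0 <= t <= 1, so other
    values are rejected here (an assumption of this sketch).
    """
    if not 0 <= t <= 1:
        raise ValueError("the plain series is only reliable for 0 <= t <= 1")
    return sum((-1) ** (j + 1) * t ** j * n_j for j, n_j in hist.items())


if __name__ == "__main__":
    # Toy sample: a seen 2x, b 1x, c 3x, d 1x, so n_1 = 2, n_2 = 1, n_3 = 1.
    sample = ["a", "a", "b", "c", "c", "c", "d"]
    hist = counts_histogram(sample)
    # Doubling the sample (t = 1): 2 - 1 + 1 = 2 expected new classes.
    print(good_toulmin(hist, 1.0))
```

Restricting t to at most 1 reflects the known instability of the raw series for larger extrapolations, which is the limitation the rational function approach described in the abstract is meant to address.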