Improving draft genome contiguity with reference-derived in silico mate-pair libraries.

Authors: José Horacio Grau, Thomas Hackl, Klaus-Peter Koepfli, Michael Hofreiter

DOI: 10.1093/GIGASCIENCE/GIY029

Abstract: Background: Contiguous genome assemblies are a highly valued biological resource because of the higher number of completely annotated genes and genomic elements that are usable compared to fragmented draft genomes. Nonetheless, contiguity is difficult to obtain if only low-coverage data and/or distantly related references are available. Findings: In order to improve contiguity, we have developed Cross-Species Scaffolding, a new pipeline that imports long-range distance information directly into the de novo assembly process by constructing mate-pair libraries in silico. Conclusions: We show how assembly metrics and gene prediction improve dramatically with our pipeline by assembling two primate genomes solely based on ∼30x shotgun sequencing data.
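
To make the idea of an in silico mate-pair library concrete, the following is a minimal Python sketch and not the authors' pipeline. It assumes a single reference-derived sequence is already available as a string, and it cuts read pairs from the two ends of fixed-size fragments. The function names, the sliding-window sampling, and the forward/reverse orientation of the second read are all illustrative assumptions; the abstract does not specify any of them.

```python
# Hypothetical helper names; this is an illustration, not the published pipeline.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_mate_pairs(seq: str, insert_size: int, read_len: int, step: int):
    """Cut read pairs from both ends of fragments of length insert_size.

    Assumption: fragments are sampled with a fixed sliding step, and the
    second read is emitted as the reverse complement of the fragment's
    far end (forward/reverse orientation).
    """
    if read_len > insert_size:
        raise ValueError("read_len must not exceed insert_size")
    pairs = []
    for start in range(0, len(seq) - insert_size + 1, step):
        fragment = seq[start:start + insert_size]
        left = fragment[:read_len]
        right = reverse_complement(fragment[-read_len:])
        pairs.append((left, right))
    return pairs


if __name__ == "__main__":
    # Toy sequence; real inserts would span kilobases, not 8 bp.
    for left, right in in_silico_mate_pairs("AAAACCCCGG", 8, 3, 2):
        print(left, right)
```

The distance between the two reads of each pair is known by construction, which is the long-range information an assembler can use for scaffolding. Real mate-pair libraries often use the opposite read orientation, so the orientation would need to match whatever the chosen assembler expects.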
