Deconvolute individual genomes from metagenome sequences through short read clustering.

Authors: Kexue Li, Yakang Lu, Li Deng, Lili Wang, Lizhen Shi

DOI: 10.7717/PEERJ.8966

Abstract: Metagenome assembly from short next-generation sequencing data is a challenging process due to its large scale and computational complexity. Clustering short reads by species before assembly offers a unique opportunity for parallel downstream assembly of genomes with individualized optimization. However, current read clustering methods suffer from either false negative (under-clustering) or false positive (over-clustering) problems. Here we extended our previous read clustering software, SpaRC, by exploiting statistics derived from multiple samples in a dataset to reduce the under-clustering problem. Using synthetic and real-world datasets, we demonstrated that this method has the potential to cluster almost all of the short reads from genomes with sufficient sequencing coverage. The improved read clustering in turn leads to improved downstream genome assembly quality.
