An Integrated Perspective on Phylogenetic Workflows.

Authors: August Guang, Felipe Zapata, Mark Howison, Charles E. Lawrence, Casey W. Dunn

DOI: 10.1016/J.TREE.2015.12.007

Keywords: Phylogenetic tree, Inference, Bioinformatics, Biology, Generative model, Component (UML), Machine learning, Tree (data structure), Identification (biology), Molecular phylogenetics, Artificial intelligence, Phylogenetics

Abstract: Molecular phylogenetics is the study of evolutionary relationships between biological sequences, often to infer the evolutionary relationships of organisms. These studies require many analysis components, including sequence assembly, identification of homologous sequences, gene tree inference, and species tree inference. At present, each component is usually treated as a single step in a linear analysis, where the output of each component is passed as input to the next as a point estimate. Here we outline a generative model that helps clarify assumptions that are implicit in phylogenetic workflows, focusing on the assumption of low relative entropy. This perspective unifies currently disparate advances, and will help investigators evaluate which steps would benefit most from additional computation and future methods development.
