A support vector machine based test for incongruence between sets of trees in tree space

Authors: David C Haws, Peter Huggins, Eric M O’Neill, David W Weisrock, Ruriko Yoshida

DOI: 10.1186/1471-2105-13-210

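Abstract: The increased use of multi-locus data sets for phylogenetic reconstruction has increased the need to determine whether a set of gene trees significantly deviates from the phylogenetic patterns of other genes. Such unusual gene trees may have been influenced by other evolutionary processes such as selection, gene duplication, or horizontal gene transfer. Motivated by this problem, we propose a nonparametric goodness-of-fit test for two empirical distributions of gene trees, and we developed the software GeneOut to estimate a p-value for the test. Our approach maps trees into a multi-dimensional vector space and then applies support vector machines (SVMs) to measure the separation between two pre-defined sets of gene trees. We use a permutation test to assess the significance of the SVM separation. To demonstrate the performance of GeneOut, we applied it to the comparison of gene trees simulated within different species trees across a range of species tree depths. Applied directly to sets of simulated gene trees with large sample sizes, GeneOut was able to detect very small differences between two sets of gene trees generated under different species trees. Our statistical test can also include tree reconstruction in its test framework through a variety of phylogenetic optimality criteria. When applied to DNA sequence data simulated from different sets of gene trees, results in the form of receiver operating characteristic (ROC) curves indicated that GeneOut performed well in the detection of differences between sets of trees with different distributions in a multi-dimensional space. Furthermore, it controlled false positive and false negative rates very well, indicating a high degree of accuracy. The non-parametric nature of our statistical test provides fast and efficient analyses, and makes it an applicable test for any scenario where evolutionary or other factors can lead to trees with different multi-dimensional distributions. The software GeneOut is freely available under the GNU public license.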
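
The procedure outlined in the abstract (vectorize the trees, measure SVM separation between the two sets, assess significance by permuting set labels) can be illustrated with a minimal sketch. This is not the GeneOut implementation. It assumes each tree has already been mapped to a fixed-length numeric vector (for example, a vector of pairwise leaf-to-leaf distances, which is an assumption here), it uses a small hand-written linear SVM trained by subgradient descent, and it takes training accuracy as a stand-in separation statistic. All function names and parameters are hypothetical.

```python
import random

def center(X):
    """Subtract the per-coordinate mean so a separator can pass near the origin."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]

def train_linear_svm(X, y, lam=0.01, eta=0.05, epochs=100):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:  # this point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b

def separation(X, y):
    """Training accuracy of the linear SVM (assumed separation statistic)."""
    w, b = train_linear_svm(X, y)
    correct = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        pred = 1 if score >= 0 else -1
        correct += (pred == yi)
    return correct / len(X)

def permutation_p_value(set_a, set_b, n_perm=99, seed=0):
    """Estimate how often shuffled set labels separate at least as well as the real ones."""
    rng = random.Random(seed)
    X = center(set_a + set_b)
    y = [1] * len(set_a) + [-1] * len(set_b)
    observed = separation(X, y)
    hits = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if separation(X, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (1 + hits) / (1 + n_perm)

if __name__ == "__main__":
    # Toy vectors standing in for two sets of vectorized gene trees.
    rng = random.Random(1)
    a = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(10)]
    b = [[rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)] for _ in range(10)]
    print(permutation_p_value(a, b))
```

A small p-value suggests that the two sets occupy different regions of the vector space. The smallest value this estimator can return is 1 / (1 + n_perm), so the number of permutations bounds the resolution of the test.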
