Authors: David C Haws, Peter Huggins, Eric M O’Neill, David W Weisrock, Ruriko Yoshida
Keywords:
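Abstract: The increased use of multi-locus data sets for phylogenetic reconstruction has increased the need to determine whether a set of gene trees significantly deviates from the phylogenetic patterns of other genes. Such unusual gene trees may have been influenced by other evolutionary processes such as selection, gene duplication, or horizontal gene transfer. Motivated by this problem, we propose a nonparametric goodness-of-fit test for two empirical distributions of gene trees, and we developed the software GeneOut to estimate a p-value for the test. Our approach maps trees into a multi-dimensional vector space and then applies support vector machines (SVMs) to measure the separation between two pre-defined sets of gene trees. We use a permutation test to assess the significance of the SVM separation. To demonstrate the performance of GeneOut, we applied it to the comparison of gene trees simulated within different species trees across a range of species tree depths. Applied directly to sets of simulated gene trees with large sample sizes, GeneOut was able to detect very small differences between two sets of gene trees generated under different species trees. Our statistical test can also include tree reconstruction in its test framework through a variety of phylogenetic optimality criteria. When applied to DNA sequence data simulated from different sets of gene trees, results in the form of receiver operating characteristic (ROC) curves indicated that GeneOut performed well in the detection of differences between sets of trees with different distributions in a multi-dimensional space. Furthermore, it controlled false positive and false negative rates very well, indicating a high degree of accuracy. The non-parametric nature of our statistical test provides fast and efficient analyses, and makes it an applicable test for any scenario where evolutionary or other factors can lead to trees with different multi-dimensional distributions. The software GeneOut is freely available under the GNU public license.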
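The test outlined in the abstract (embed trees as vectors, measure how well an SVM separates the two sets, then compare that separation against label permutations) can be illustrated with a minimal sketch. This is not GeneOut's code. It makes several assumptions: the trees are already embedded as numeric vectors (the toy `set_a` and `set_b` are hypothetical stand-ins), the SVM is a simple Pegasos-style linear SVM without a bias term trained on pooled-mean-centered data, and training accuracy stands in for the separation measure, which the abstract does not specify. All function names are illustrative.

```python
import random

def center(X):
    """Subtract the pooled mean so a hyperplane through the origin is reasonable."""
    n, dim = len(X), len(X[0])
    mean = [sum(x[j] for x in X) / n for j in range(dim)]
    return [[x[j] - mean[j] for j in range(dim)] for x in X]

def train_linear_svm(X, y, lam=0.01, epochs=20, rng=None):
    """Pegasos-style stochastic subgradient descent on the hinge loss (no bias)."""
    rng = rng or random.Random(0)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization)
            if margin < 1.0:  # point violates the margin: hinge subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def separation(X, y, rng):
    """Training accuracy of the fitted SVM (assumed separation statistic)."""
    w = train_linear_svm(X, y, rng=rng)
    correct = sum(1 for xi, yi in zip(X, y)
                  if yi * sum(wj * xj for wj, xj in zip(w, xi)) > 0)
    return correct / len(X)

def permutation_p_value(set_a, set_b, n_perm=99, seed=0):
    """Return (observed separation, permutation p-value) for two vector sets."""
    rng = random.Random(seed)
    X = center(set_a + set_b)
    y = [1] * len(set_a) + [-1] * len(set_b)
    observed = separation(X, y, rng)
    hits = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any association between vectors and sets
        if separation(X, y_perm, rng) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (1 + hits) / (1 + n_perm)

# Toy vectors standing in for two sets of embedded gene trees (hypothetical data).
set_a = [[1.0 + 0.1 * (i % 3), 1.0 + 0.1 * (i % 4)] for i in range(10)]
set_b = [[5.0 + 0.1 * (i % 3), 5.0 + 0.1 * (i % 4)] for i in range(10)]
observed, p_value = permutation_p_value(set_a, set_b)
print(f"separation = {observed:.2f}, p-value = {p_value:.3f}")
```

Because the labels are shuffled while the vectors stay fixed, the permuted separations approximate the null distribution in which both sets come from the same distribution of trees. A small p-value therefore indicates that the observed separation is unlikely under that null.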