Phylogenetic Inference from Word Lists Using Weighted Alignment with Empirically Determined Weights

Authors:

DOI: 10.1163/9789004281523_007

Keywords:

Abstract: The paper investigates the task of inferring a phylogenetic tree of languages from the collection of word lists made available by the Automated Similarity Judgment Project. This task involves three steps: (1) computing pairwise word distances, (2) aggregating word distances to a distance measure between languages and inferring a phylogenetic tree from these distances, and (3) evaluating the result by comparing it to expert classifications. For the first task, weighted alignment will be used, and a method to determine the weights empirically will be presented. For the second task, a novel distance measure between word lists will be developed that attempts to minimize the bias resulting from missing data. For the third task, several methods from the literature will be applied to a large collection of language samples to enable statistical testing. It will be shown that the language distance measure proposed here leads to substantially more accurate phylogenies than a method relying on unweighted Levenshtein distances between words.
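
The contrast the abstract draws between unweighted Levenshtein distance and weighted alignment can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not describe how the weights are estimated, so the substitution weights below (`ILLUSTRATIVE_WEIGHTS`) are hypothetical values chosen only for demonstration, and the function names are likewise assumptions.

```python
from typing import Callable

def edit_distance(a: str, b: str,
                  sub_cost: Callable[[str, str], float],
                  gap_cost: float = 1.0) -> float:
    """Dynamic-programming edit distance with a pluggable substitution cost."""
    m, n = len(a), len(b)
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    # Aligning a prefix with the empty string costs one gap per symbol.
    for i in range(1, m + 1):
        dist[i][0] = i * gap_cost
    for j in range(1, n + 1):
        dist[0][j] = j * gap_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + gap_cost,                          # deletion
                dist[i][j - 1] + gap_cost,                          # insertion
                dist[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # (mis)match
            )
    return dist[m][n]

def unit_cost(x: str, y: str) -> float:
    """Unweighted Levenshtein: every substitution costs 1."""
    return 0.0 if x == y else 1.0

# Hypothetical weights (NOT from the paper): similar sounds substitute cheaply.
ILLUSTRATIVE_WEIGHTS = {
    frozenset("pb"): 0.3,
    frozenset("td"): 0.3,
    frozenset("kg"): 0.3,
}

def weighted_cost(x: str, y: str) -> float:
    """Weighted variant: look up the symbol pair, fall back to cost 1."""
    if x == y:
        return 0.0
    return ILLUSTRATIVE_WEIGHTS.get(frozenset((x, y)), 1.0)

unweighted = edit_distance("pat", "bad", unit_cost)
weighted = edit_distance("pat", "bad", weighted_cost)
```

Under these assumed weights, the pair "pat"/"bad" is treated as closer than under unit costs, because both substitutions involve sounds that the weight table marks as similar. An empirically estimated weight table would take the place of the hand-written dictionary.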
