Author:
DOI: 10.1163/9789004281523_007
Keywords:
Abstract: The paper investigates the task of inferring a phylogenetic tree of languages from the collection of word lists made available by the Automated Similarity Judgment Project. This task involves three steps: (1) computing pairwise word distances, (2) aggregating word distances to a distance measure between languages and inferring a phylogenetic tree from these distances, and (3) evaluating the result by comparing it to expert classifications. For the first task, weighted alignment will be used, and a method to determine weights empirically will be presented. For the second task, a novel method will be developed that attempts to minimize the bias resulting from missing data. For the third task, several methods from the literature will be applied to a large collection of language samples to enable statistical testing. It will be shown that the language distance measure proposed here leads to substantially more accurate phylogenies than a method relying on unweighted Levenshtein distances between words.