Author: Taraka Rama
DOI: 10.18653/V1/K18-1027
Keywords: Task (project management), Word list, Computer science, Artificial intelligence, Cognate, Chinese restaurant process, Similarity (network science), Identification (information), Natural language processing, Cluster analysis, Language family
Abstract: We present and evaluate two similarity dependent Chinese Restaurant Process (sd-CRP) algorithms on the task of automated cognate detection. The sd-CRP clustering algorithms do not require any predefined threshold for detecting cognate sets in a multilingual word list. We evaluate their performance on six language families (more than 750 languages) and find that both variants perform as well as InfoMap and better than UPGMA at inferring cognate clusters. The algorithms presented in this paper are family agnostic and can be applied to any linguistically under-studied language family.
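To illustrate how a CRP-style model can cluster words without a predefined threshold, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not describe the sd-CRP inference procedure or the similarity function, so this sketch only draws links from a similarity-weighted prior in the spirit of distance-dependent CRPs. The names `sample_links`, `clusters_from_links`, the toy matrix `sim`, and the self-link weight `alpha` are all hypothetical.

```python
import random


def sample_links(sim, alpha, rng):
    """Draw one link per word (assumed prior, not the paper's sampler).

    Word i links to word j with weight sim[i][j], or to itself with
    weight alpha. A self-link lets a word start its own cluster.
    """
    n = len(sim)
    links = []
    for i in range(n):
        weights = [alpha if j == i else sim[i][j] for j in range(n)]
        links.append(rng.choices(range(n), weights=weights, k=1)[0])
    return links


def clusters_from_links(links):
    """Return clusters as connected components of the link graph."""
    parent = list(range(len(links)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(links):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j

    groups = {}
    for i in range(len(links)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy similarity matrix (made up for illustration): words 0 and 1 are
# similar to each other, word 2 is similar to nothing.
sim = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
links = sample_links(sim, alpha=1.0, rng=random.Random(0))
clusters = clusters_from_links(links)
```

The number of clusters is not fixed in advance and no similarity cutoff is applied. It follows from which links are drawn, with `alpha` controlling how readily a word opens a new cluster.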