Using lexical language models to detect borrowings in monolingual wordlists.

Authors: John E. Miller, Tiago Tresoldi, Roberto Zariquiey, César A. Beltrán Castañón, Natalia Morozova

DOI: 10.1371/JOURNAL.PONE.0242709

Abstract: Lexical borrowing, the transfer of words from one language to another, is one of the most frequent processes in language evolution. In order to detect borrowings, linguists make use of various strategies, combining evidence from various sources. Despite the increasing popularity of computational approaches in comparative linguistics, automated approaches to lexical borrowing detection are still in their infancy, disregarding many aspects of the evidence that is routinely considered by human experts. One example of this kind of evidence is phonological and phonotactic clues, which are especially useful for detecting recent borrowings that have not yet been adapted to the structure of their recipient languages. In this study, we test how these clues can be exploited in automated frameworks for borrowing detection. By modeling phonology and phonotactics with the support of Support Vector Machines, Markov models, and recurrent neural networks, we propose a framework for the supervised detection of borrowings in monolingual wordlists. Based on a substantially revised dataset in which lexical borrowings have been thoroughly annotated for 41 different languages from different families, featuring a large typological diversity, we use these models to conduct a series of experiments to investigate their performance in monolingual borrowing detection. While the general results appear largely unsatisfying at first glance, further tests show that the performance of our models improves with increasing amounts of attested borrowings and in those cases where most borrowings were introduced by one donor language alone. Our results show that phonological and phonotactic clues derived from monolingual data alone are often not sufficient to detect borrowings when using them in isolation. Based on our detailed findings, however, we express the hope that they could prove useful in integrated approaches that take multilingual information into account.
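The idea of detecting borrowings from phonotactic clues with a Markov model can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a first-order (bigram) model over segments with add-one smoothing, trains one model on known inherited words and one on known borrowings, and labels a new word as borrowed when the borrowing model gives it the higher average log-probability per transition. The class and function names, the boundary symbol, and the toy wordlists are all invented for illustration and do not come from the paper's dataset.

```python
import math
from collections import Counter

BOUNDARY = "#"  # hypothetical word-boundary symbol, not taken from the paper


class BigramModel:
    """First-order Markov model over segments with add-one smoothing."""

    def __init__(self, words, alphabet):
        # The smoothing vocabulary covers every known segment plus the boundary.
        self.vocab = set(alphabet) | {BOUNDARY}
        self.bigrams = Counter()
        self.contexts = Counter()
        for word in words:
            padded = [BOUNDARY, *word, BOUNDARY]
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[prev, cur] += 1
                self.contexts[prev] += 1

    def mean_log_prob(self, word):
        """Average smoothed log-probability per segment transition."""
        padded = [BOUNDARY, *word, BOUNDARY]
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = self.bigrams[prev, cur] + 1
            denominator = self.contexts[prev] + len(self.vocab)
            total += math.log(numerator / denominator)
        return total / (len(padded) - 1)


def is_borrowed(word, native_model, loan_model):
    """Label a word as borrowed if the loan model fits it better."""
    return loan_model.mean_log_prob(word) > native_model.mean_log_prob(word)


# Invented toy wordlists: "inherited" words are CV-shaped,
# "borrowed" words begin with consonant clusters.
inherited = ["tata", "taka", "kata", "kaka", "pata"]
borrowed = ["stra", "spra", "spro"]
alphabet = set("".join(inherited + borrowed))

native_model = BigramModel(inherited, alphabet)
loan_model = BigramModel(borrowed, alphabet)

for candidate in ["paka", "stro"]:
    print(candidate, is_borrowed(candidate, native_model, loan_model))
```

On this toy data the cluster-initial word "stro" is labeled as borrowed and the CV-shaped word "paka" is not. Real wordlists would need segmented phonetic transcriptions and held-out evaluation, and, as the abstract notes, such monolingual clues are often insufficient on their own.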
