Authors: John E. Miller, Tiago Tresoldi, Roberto Zariquiey, César A. Beltrán Castañón, Natalia Morozova
DOI: 10.1371/JOURNAL.PONE.0242709
Keywords:
Abstract: Lexical borrowing, the transfer of words from one language to another, is one of the most frequent processes in language evolution. In order to detect borrowings, linguists make use of various strategies, combining evidence from various sources. Despite the increasing popularity of computational approaches in comparative linguistics, automated approaches to lexical borrowing detection are still in their infancy, disregarding many aspects of the evidence that is routinely considered by human experts. One example of this kind of evidence is phonological and phonotactic clues, which are especially useful for the detection of recent borrowings that have not yet been adapted to the structure of their recipient languages. In this study, we test how these clues can be exploited in automated frameworks for borrowing detection. By modeling phonology and phonotactics with the support of Support Vector Machines, Markov models, and recurrent neural networks, we propose a framework for the supervised detection of borrowings in mono-lingual wordlists. Based on a substantially revised dataset in which lexical borrowings have been thoroughly annotated for 41 different languages from different families, featuring a large typological diversity, we use these models to conduct a series of experiments to investigate their performance in mono-lingual borrowing detection. While the general results appear largely unsatisfying at first glance, further tests show that the performance of our models improves with increasing amounts of attested borrowings and in those cases where most borrowings were introduced by one donor language alone. Our results show that phonological and phonotactic clues derived from monolingual language data alone are often not sufficient to detect borrowings when using them in isolation. Based on our detailed findings, however, we express hope that they could prove to be useful in integrated approaches that take multi-lingual information into account.