Data Driven Models for Language Evolution

Author: Antonella Delmestri

DOI:

Keywords: Ancestor, Language family, Cognate, Tree (data structure), Natural language processing, Natural language, String metric, Identification (biology), Artificial intelligence, Computer science, Variation (linguistics)

Abstract: Natural languages that originate from a common ancestor are genetically related. Words are the core of any language, and cognates are words sharing the same etymology. Cognate identification therefore represents the foundation upon which the evolutionary history of languages may be discovered, while linguistic phylogenetic inference aims to estimate the genetic relationships that exist between them. In this thesis, using several techniques originally developed for biological sequence analysis, we have designed a data driven orthographic learning system for measuring string similarity and have successfully applied it to the tasks of cognate identification and phylogenetic inference. Our system has outperformed the best comparable phonetic models previously reported in the literature, with results that are statistically significant and remarkably stable regardless of variation in the training dataset dimension. When applied to the Indo-European language family, whose higher structure does not yet have consensus, our method estimated phylogenies compatible with the benchmark tree and correctly reproduced all the established major groups and subgroups present in the dataset.


