Linguistic Divergence of Sinhala and Tamil Languages in Machine Translation

Authors: W.S.N. Dilshani, S. Yashothara, R. T. Uthayasanker, S. Jayasena

DOI: 10.1109/IALP.2018.8629113

Keywords: Tamil, Natural language processing, Divergence (linguistics), Machine translation, Semantics, Computer science, Artificial intelligence

Abstract: This paper presents a study of the lexical-semantic divergence between the Sinhala and Tamil languages. The study is critical, as differences in the linguistic and extra-linguistic features of languages play pivotal roles in translation. This research is the first based on Dorr's classification. We propose a computer-assisted procedure using statistical machine translation, which is easy and gives good performance compared to traditional approaches. Accordingly, this paper has the twin aims of revisiting the classification of divergence types outlined by Dorr and outlining some new patterns specific to Sinhala and Tamil. It also proposes a rule-based algorithm to classify divergence.

References (13)
James Barnett, Inderjeet Mani, Elaine Rich. Reversible Machine Translation: What to Do When the Languages Don’t Match Up. Reversible Grammar in Natural Language Processing, pp. 321-364 (1994). DOI: 10.1007/978-1-4615-2722-0_13
Niladri Sekhar Dash. Linguistic Divergences in English to Bengali Translation. International Journal of English Linguistics, vol. 3, pp. 31- (2013). DOI: 10.5539/IJEL.V3N1P31
Abdus Saboor, Mohammad Abid Khan. Lexical-semantic divergence in Urdu-to-English Example Based Machine Translation. International Conference on Emerging Technologies, pp. 316-320 (2010). DOI: 10.1109/ICET.2010.5638469
S. B. Kulkarni, P. D. Deshmukh, M. M. Kazi, K. V. Kale. Linguistic Divergence Patterns in English to Marathi Translation. International Journal of Computer Applications, vol. 87, pp. 21-26 (2014). DOI: 10.5120/15197-3579
Philipp Koehn, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, Evan Herbst, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran. Moses: Open Source Toolkit for Statistical Machine Translation. Meeting of the Association for Computational Linguistics, pp. 177-180 (2007). DOI: 10.3115/1557769.1557821
Kenneth Heafield. KenLM: Faster and Smaller Language Model Queries. Workshop on Statistical Machine Translation, pp. 187-197 (2011)
Franz Josef Och. Minimum error rate training in statistical machine translation. Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - ACL '03, pp. 160-167 (2003). DOI: 10.3115/1075096.1075117
Liang Huang, David Chiang. Forest Rescoring: Faster Decoding with Integrated Language Models. Meeting of the Association for Computational Linguistics, pp. 144-151 (2007)
H.M.N.D. Hearth, Y.W.S.N. Amarasooriya, R.A.U.M. Ranathunga. A Comparative Analysis on Cases in Sinhalese and Tamil Languages. Department of Linguistics, University of Kelaniya, Sri Lanka (2016)