Fuzzy translation of cross-lingual spelling variants

Authors: Ari Pirkola, Jarmo Toivonen, Heikki Keskustalo, Kari Visala, Kalervo Järvelin

DOI: 10.1145/860435.860498

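Abstract: We will present a novel two-step fuzzy translation technique for cross-lingual spelling variants. In the first stage, transformation rules are applied to source words to render them more similar to their target language equivalents. The transformation rules are generated automatically using translation dictionaries as source data. In the second stage, the intermediate forms obtained in the first stage are translated into a target language using fuzzy matching. The effectiveness of the technique was evaluated empirically using five source languages and English as a target language. The target word list contained 189 000 English words with the correct equivalents for the source words among them. The source words were translated using the two-step fuzzy translation technique, and the results were compared with those of plain fuzzy matching based translation. The combined technique performed better, sometimes considerably better, than fuzzy matching alone.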
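The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two rewrite rules, the tiny target list, and all function names are hypothetical and chosen only for illustration; the paper generates its rules automatically from dictionaries. The fuzzy matching step here uses plain character bigrams with the Dice coefficient as a stand-in for whatever matching method the paper actually uses.

```python
import re
from typing import List, Set, Tuple

# Hypothetical Italian-to-English rewrite rules, invented for this sketch.
# The paper derives its rules automatically from translation dictionaries.
RULES: List[Tuple[str, str]] = [
    (r"f", "ph"),   # fotografia -> photographia
    (r"ia$", "y"),  # photographia -> photography
]

# Hypothetical target-language word list (the paper's list has 189 000 words).
TARGETS: List[str] = ["photograph", "photography", "topography", "geography"]


def apply_rules(word: str, rules: List[Tuple[str, str]] = RULES) -> str:
    """Stage 1: rewrite a source word into an intermediate form."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


def bigrams(word: str) -> Set[str]:
    """Character bigrams of the word, padded with '_' at both ends."""
    padded = f"_{word.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient over the bigram sets of two words."""
    x, y = bigrams(a), bigrams(b)
    return 2 * len(x & y) / (len(x) + len(y))


def fuzzy_match(word: str, targets: List[str]) -> List[Tuple[str, float]]:
    """Stage 2: rank target words by bigram similarity to the given form."""
    scored = [(t, dice(word, t)) for t in targets]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def two_step_translate(word: str, targets: List[str]) -> List[Tuple[str, float]]:
    """Apply the rewrite rules, then fuzzy-match the intermediate form."""
    return fuzzy_match(apply_rules(word), targets)


if __name__ == "__main__":
    # Compare plain fuzzy matching with the two-step variant.
    print(fuzzy_match("fotografia", TARGETS)[0])
    print(two_step_translate("fotografia", TARGETS)[0])
```

In this toy case the rewritten form coincides with the target word, so the second stage finds an exact match. With real data the intermediate form is usually only closer to the target, which is why the fuzzy matching stage is still needed.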

References (10)
Kalervo Järvelin, Heikki Keskustalo, Ari Pirkola, Erkka Leppänen, Antti-Pekka Känsälä. Targeted s-gram matching: a novel n-gram matching technique for cross- and mono-lingual word form variants. Information Research, vol. 7 (2002).
Norbert Fuhr, Thomas Poersch, Ulrich Pfeifer. Searching Proper Names in Databases. HIM, pp. 259-275 (1995).
Gerard Salton. Automatic text processing: the transformation, analysis, and retrieval of information by computer. Addison-Wesley Longman Publishing Co., Inc. (1989).
Justin Zobel, Philip Dart. Phonetic string matching: lessons from information retrieval. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 166-172 (1996). DOI: 10.1145/243199.243258
Bonnie Glover Stalls, Kevin Knight. Translating names and technical terms in Arabic text. International Conference on Computational Linguistics, pp. 34-41 (1998). DOI: 10.3115/1621753.1621760
Ulrich Pfeifer, Thomas Poersch, Norbert Fuhr. Retrieval effectiveness of proper name search methods. Information Processing and Management, vol. 32, pp. 667-679 (1996). DOI: 10.1016/S0306-4573(96)00042-8
Alexander M. Robertson, Peter Willett. Applications of n-grams in textual information systems. Journal of Documentation, vol. 54, pp. 48-67 (1998). DOI: 10.1108/EUM0000000007161
Michael A. Covington. An algorithm to align words for historical comparison. Computational Linguistics, vol. 22, pp. 481-496 (1996). DOI: 10.5555/256329.256333
Douglas W. Oard, Anne R. Diekema. Cross-Language Information Retrieval. Annual Review of Information Science and Technology (ARIST), vol. 33, pp. 223-256 (1998).
Jonathan Graehl, Kevin Knight. Machine transliteration. Computational Linguistics, vol. 24, pp. 599-612 (1998).