Transliteration Generation and Mining with Limited Training Resources

Authors: Sittichai Jiampojamarn, Grzegorz Kondrak, Shane Bergsma, Qing Dou, Kenneth Dwyer

DOI:

Keywords: Artificial intelligence, Machine learning, Data mining, Transliteration, Computer science, Discriminative model, Range (mathematics)

Abstract: We present DirecTL+: an online discriminative sequence prediction model based on many-to-many alignments, which is further augmented by the incorporation of joint n-gram features. Experimental results show improvement over the results achieved by DirecTL in 2009. We also explore a number of diverse resource-free and language-independent approaches to transliteration mining, which range from simple to sophisticated.

References (17)
Grzegorz Kondrak, Aditya Bhargava, Language identification of names with SVMs. North American Chapter of the Association for Computational Linguistics. pp. 693-696 (2010)
Sittichai Jiampojamarn, Grzegorz Kondrak, Colin Cherry, Integrating Joint n-gram Features into a Discriminative Training Framework. North American Chapter of the Association for Computational Linguistics. pp. 697-700 (2010)
Nello Cristianini, John Shawe-Taylor, Kernel Methods for Pattern Analysis (2004)
Andrew McCallum, Kedar Bellare, Fernando Pereira, A conditional random field for discriminatively-trained finite-state string edit distance. Uncertainty in Artificial Intelligence. pp. 388-395 (2005), 10.21236/ADA440386
Ultraconservative online algorithms for multiclass problems. Journal of Machine Learning Research. vol. 3, pp. 951-991 (2003), 10.1162/JMLR.2003.3.4-5.951
Grzegorz Kondrak, A new algorithm for the alignment of phonetic sequences. North American Chapter of the Association for Computational Linguistics. pp. 288-295 (2000)
Alexandre Klementiev, Dan Roth, Named Entity Transliteration and Discovery from Multilingual Comparable Corpora. Language and Technology Conference. pp. 82-88 (2006), 10.3115/1220835.1220846
Grzegorz Kondrak, Shane Bergsma, Alignment-Based Discriminative String Similarity. Meeting of the Association for Computational Linguistics. pp. 656-663 (2007)
E.S. Ristad, P.N. Yianilos, Learning string-edit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence. vol. 20, pp. 522-532 (1998), 10.1109/34.682181
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Chih-Jen Lin, Xiang-Rui Wang, LIBLINEAR: A Library for Large Linear Classification. Journal of Machine Learning Research. vol. 9, pp. 1871-1874 (2008)