Simple task-specific bilingual word embeddings

Authors: Stephan Gouws, Anders Søgaard

DOI: 10.3115/V1/N15-1157

Abstract: We introduce a simple wrapper method that uses off-the-shelf word embedding algorithms to learn task-specific bilingual word embeddings. We use a small dictionary of easily-obtainable task-specific word equivalence classes to produce mixed context-target pairs that we use to train off-the-shelf embedding models. Our model has the advantage that it (a) is independent of the choice of embedding algorithm, (b) does not require parallel data, and (c) can be adapted to specific tasks by re-defining the equivalence classes. We show how our method outperforms off-the-shelf bilingual embeddings on the task of unsupervised cross-language part-of-speech (POS) tagging, as well as on the task of semi-supervised cross-language super sense (SuS) tagging.
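
The mixing step described in the abstract can be illustrated with a minimal sketch. It assumes tokenized sentences from both languages and a list of word pairs declared equivalent for the task. The function names, the `replace_prob` parameter, and the window size are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict


def build_equivalence_classes(pairs):
    """Map each word to the set of words declared equivalent to it."""
    classes = defaultdict(set)
    for a, b in pairs:
        classes[a].add(b)
        classes[b].add(a)
    return classes


def mix_corpus(sentences, classes, replace_prob=0.5, seed=0):
    """Randomly swap words for an equivalent word from the dictionary.

    replace_prob is an assumed hyperparameter for this sketch.
    """
    rng = random.Random(seed)
    mixed = []
    for sentence in sentences:
        out = []
        for token in sentence:
            candidates = classes.get(token)
            if candidates and rng.random() < replace_prob:
                # Sort so the choice is reproducible for a fixed seed.
                out.append(rng.choice(sorted(candidates)))
            else:
                out.append(token)
        mixed.append(out)
    return mixed


def context_target_pairs(sentences, window=2):
    """Extract (context, target) pairs within a symmetric window."""
    pairs = []
    for sentence in sentences:
        for i, target in enumerate(sentence):
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((sentence[j], target))
    return pairs
```

The mixed sentences, or the pairs extracted from them, could then be passed unchanged to any monolingual embedding trainer, which is what makes the approach independent of the embedding algorithm. Redefining the pairs given to `build_equivalence_classes` (for example, words sharing a POS tag rather than translations) is how such a sketch would be adapted to a different task.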

References (15)
Oscar Täckström, Ryan McDonald, Jakob Uszkoreit, Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure. North American Chapter of the Association for Computational Linguistics. pp. 477-487 (2012)
Taylor Berg-Kirkpatrick, Alexandre Bouchard-Côté, John DeNero, Dan Klein, Painless Unsupervised Learning with Features. North American Chapter of the Association for Computational Linguistics. pp. 582-590 (2010)
Massimiliano Ciaramita, Yasemin Altun, Broad-Coverage Sense Disambiguation and Information Extraction with a Supersense Sequence Tagger. Empirical Methods in Natural Language Processing. pp. 594-602 (2006), 10.3115/1610075.1610158
Karl Moritz Hermann, Phil Blunsom, Multilingual Models for Compositional Distributed Semantics. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 58-68 (2014), 10.3115/V1/P14-1006
Tomáš Kočiský, Karl Moritz Hermann, Phil Blunsom, Learning Bilingual Word Representations by Marginalizing Alignments. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). pp. 224-229 (2014), 10.3115/V1/P14-2037
Joseph Turian, Lev-Arie Ratinov, Yoshua Bengio, Word Representations: A Simple and General Method for Semi-Supervised Learning. Meeting of the Association for Computational Linguistics. pp. 384-394 (2010)
Hal Daumé, John Langford, Daniel Marcu, Search-based structured prediction. Machine Learning. vol. 75, pp. 297-325 (2009), 10.1007/S10994-009-5106-X
Oscar Täckström, Dipanjan Das, Slav Petrov, Ryan McDonald, Joakim Nivre, Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging. Transactions of the Association for Computational Linguistics. vol. 1, pp. 1-12 (2013), 10.1162/TACL_A_00205
Laurens van der Maaten, Geoffrey Hinton, Visualizing Data using t-SNE. Journal of Machine Learning Research. vol. 9, pp. 2579-2605 (2008)
Duyu Tang, Furu Wei, Nan Yang, Ming Zhou, Ting Liu, Bing Qin, Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). vol. 1, pp. 1555-1565 (2014), 10.3115/V1/P14-1146