Bilingual Learning of Multi-sense Embeddings with Discrete Autoencoders

Authors: Simon Šuster, Ivan Titov, Gertjan van Noord

DOI: 10.18653/V1/N16-1160

Abstract: We present an approach to learning multi-sense word embeddings relying both on monolingual and bilingual information. Our model consists of an encoder, which uses monolingual and bilingual context (i.e. a parallel sentence) to choose a sense for a given word, and a decoder, which predicts context words based on the chosen sense. The two components are estimated jointly. We observe that the word representations induced from bilingual data outperform the monolingual counterparts across a range of evaluation tasks, even though crosslingual information is not available at test time.
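
The encoder/decoder split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the scoring functions (dot products with averaged context vectors), the dimensions, and every name below (`encode`, `decode`, `expected_log_likelihood`, `sense_vecs`, `output_vecs`) are assumptions chosen for illustration. It shows an encoder that turns monolingual and bilingual context into a distribution over senses, a decoder that predicts context words from one sense, and a joint objective that weights the decoder's log-likelihood by the encoder's sense probabilities.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_vector(vectors, dim):
    """Average a list of vectors; an empty list yields the zero vector."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def encode(sense_vecs, mono_ctx, bi_ctx, dim):
    """Encoder (assumed form): distribution over senses given both contexts."""
    mono = mean_vector(mono_ctx, dim)
    bi = mean_vector(bi_ctx, dim)
    ctx = [a + b for a, b in zip(mono, bi)]
    return softmax([dot(s, ctx) for s in sense_vecs])


def decode(sense_vec, output_vecs):
    """Decoder (assumed form): distribution over the vocabulary given a sense."""
    return softmax([dot(sense_vec, o) for o in output_vecs])


def expected_log_likelihood(sense_probs, sense_vecs, output_vecs, target_ids):
    """Joint objective: decoder log-likelihood averaged over encoder senses."""
    total = 0.0
    for p, s in zip(sense_probs, sense_vecs):
        word_probs = decode(s, output_vecs)
        total += p * sum(math.log(word_probs[t]) for t in target_ids)
    return total


# Toy data: all sizes and vectors are hypothetical.
random.seed(0)
DIM, N_SENSES, VOCAB = 4, 3, 6


def rand_vec():
    return [random.uniform(-1, 1) for _ in range(DIM)]


sense_vecs = [rand_vec() for _ in range(N_SENSES)]
output_vecs = [rand_vec() for _ in range(VOCAB)]
mono_ctx = [rand_vec() for _ in range(2)]  # same-language context words
bi_ctx = [rand_vec() for _ in range(3)]    # words of the parallel sentence

# Training time: both contexts inform the sense choice.
train_probs = encode(sense_vecs, mono_ctx, bi_ctx, DIM)
# Test time: no parallel sentence, so only monolingual context is used.
test_probs = encode(sense_vecs, mono_ctx, [], DIM)
objective = expected_log_likelihood(train_probs, sense_vecs, output_vecs, [1, 4])

print("sense distribution (train):", train_probs)
print("sense distribution (test): ", test_probs)
print("expected log-likelihood:   ", objective)
```

Passing an empty list as the bilingual context mirrors the setting the abstract mentions, where crosslingual information is unavailable at test time and the encoder falls back on monolingual context alone.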
