Word Embeddings: Stability and Semantic Change

Author: Lucas Rettenmeier

Abstract: Word embeddings are computed by a class of techniques within natural language processing (NLP) that create continuous vector representations of words from a large text corpus. The stochastic nature of the training process of most embedding techniques can lead to surprisingly strong instability: applying the same technique to the same data twice can produce entirely different results. In this work, we present an experimental study on the instability of three influential embedding techniques of the last decade: word2vec, GloVe and fastText. Based on the results, we propose a statistical model to describe this instability and introduce a novel metric to measure the stability of the representation of an individual word. Finally, we present a method to minimize the instability - computing a modified average over multiple runs - and apply it to a specific linguistic problem: the detection and quantification of semantic change, i.e. measuring changes in word meaning and usage over time.
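To make the two ideas in the abstract concrete, the following is a minimal sketch (not the thesis's actual metric or averaging method, whose exact definitions are not given here): instability is illustrated as the mean Jaccard overlap of each word's nearest-neighbor sets across two training runs, and the "average over multiple runs" is illustrated with a plain orthogonal Procrustes alignment before averaging. The toy vectors, vocabulary, and noise level are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["king", "queen", "apple", "banana", "car"]

# Two toy "runs": a shared embedding space plus run-specific noise,
# standing in for two independent trainings of e.g. word2vec.
base = rng.normal(size=(len(vocab), 8))
run_a = base + 0.01 * rng.normal(size=base.shape)
run_b = base + 0.01 * rng.normal(size=base.shape)

def top_k_neighbors(emb, i, k):
    """Indices of the k nearest neighbors of word i by cosine similarity."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = normed @ normed[i]
    sims[i] = -np.inf  # exclude the word itself
    return set(np.argsort(sims)[-k:])

def stability(emb_a, emb_b, k=2):
    """Mean Jaccard overlap of each word's k-NN sets across two runs.

    1.0 means every word keeps exactly the same neighbors; values near 0
    indicate strong instability between the two runs.
    """
    overlaps = []
    for i in range(emb_a.shape[0]):
        na = top_k_neighbors(emb_a, i, k)
        nb = top_k_neighbors(emb_b, i, k)
        overlaps.append(len(na & nb) / len(na | nb))
    return float(np.mean(overlaps))

def procrustes_align(src, tgt):
    """Rotate src onto tgt with the orthogonal Procrustes solution."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return src @ (u @ vt)

# Averaging runs only makes sense after aligning their coordinate systems,
# since each run lives in an arbitrary rotation of the embedding space.
averaged = 0.5 * (run_a + procrustes_align(run_b, run_a))

print("stability(a, b):", round(stability(run_a, run_b), 3))
print("stability(a, avg):", round(stability(run_a, averaged), 3))
```

With the small noise level used here both scores come out close to 1; with independently trained real embeddings, the neighbor overlap is typically well below 1, which is exactly the instability the thesis studies.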
