CompareWords: Measuring semantic change in word usage in different corpora

Authors: Stephen Taylor, Ondřej Pražák, Pavel Přibáň

DOI: 10.1016/J.SIMPA.2021.100067

Keywords: Semantic change; Measure (data warehouse); Artificial intelligence; Computer science; Meaning (existential); Word usage; Software package; Topic areas; Natural language processing

Abstract: We present CompareWords, a software package developed for measuring semantic change of particular words between two corpora. We have used it to measure changes in meaning between time periods, but it could also be used to measure changes between different topic areas or literary genres. Our technique uses word embeddings trained on each corpus, together with cross-lingual transformations. It therefore requires the corpora to be large enough to train good word embeddings.
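The abstract does not say which transformation is used or how the change score is computed. The sketch below is one common way to realize the idea, under stated assumptions: an orthogonal (Procrustes) map is fitted between the two embedding spaces using shared words as anchors, and the cosine distance between the mapped vector and the target-corpus vector is taken as the change score. The function names `orthogonal_map`, `cosine_distance`, and `semantic_change` are hypothetical and are not the CompareWords API. The embeddings are toy random vectors, not trained ones.

```python
import numpy as np

def orthogonal_map(X, Y):
    """Orthogonal W minimizing ||X @ W - Y||_F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def cosine_distance(u, v):
    """1 - cosine similarity of two vectors."""
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def semantic_change(emb_a, emb_b, target):
    """Map corpus-A space onto corpus-B space, then compare the target word.

    emb_a and emb_b are dicts from word to 1-D numpy vector (assumed format).
    The target word is excluded from the anchors used to fit the map.
    """
    anchors = sorted((set(emb_a) & set(emb_b)) - {target})
    X = np.stack([emb_a[w] for w in anchors])
    Y = np.stack([emb_b[w] for w in anchors])
    W = orthogonal_map(X, Y)
    return cosine_distance(emb_a[target] @ W, emb_b[target])

# Toy data: corpus B is corpus A under a random rotation/reflection,
# except for "w0", whose vector is flipped as a stand-in for a word
# whose usage changed between the corpora.
rng = np.random.default_rng(0)
vocab = [f"w{i}" for i in range(200)]
emb_a = {w: rng.normal(size=8) for w in vocab}
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
emb_b = {w: v @ Q for w, v in emb_a.items()}
emb_b["w0"] = -(emb_a["w0"] @ Q)

stable_score = semantic_change(emb_a, emb_b, "w1")
changed_score = semantic_change(emb_a, emb_b, "w0")
print(f"stable word:  {stable_score:.4f}")
print(f"changed word: {changed_score:.4f}")
```

With real data, `emb_a` and `emb_b` would come from embeddings trained separately on each corpus, which is why each corpus must be large enough to give reliable vectors. An orthogonal map is used here because it preserves vector lengths and angles, so distances within each space are unchanged by the alignment.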

References (7)
Tomáš Brychcín, Stephen Taylor, Lukáš Svoboda, Cross-lingual word analogies using linear transformations between semantic spaces. Expert Systems With Applications, vol. 135, pp. 287-295 (2019), 10.1016/J.ESWA.2019.06.021
Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean, Efficient Estimation of Word Representations in Vector Space. arXiv: Computation and Language (2013)
Mikel Artetxe, Gorka Labaka, Eneko Agirre, A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. Meeting of the Association for Computational Linguistics, vol. 1, pp. 789-798 (2018), 10.18653/V1/P18-1073
Nina Tahmasebi, Barbara McGillivray, Haim Dubossarsky, Dominik Schlechtweg, Simon Hengchen, SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. arXiv: Computation and Language (2020)
Stephen Taylor, Ondřej Pražák, Pavel Přibáň, UWB @ DIACR-Ita: Lexical Semantic Change Detection with CCA and Orthogonal Transformation. arXiv: Computation and Language (2020)
Stephen Taylor, Ondřej Pražák, Jakub Sido, Pavel Přibáň, UWB at SemEval-2020 Task 1: Lexical Semantic Change Detection. arXiv: Computation and Language (2020)
Tommaso Caselli, Annalina Caputo, Pierpaolo Basile, Pierluigi Cassotti, Rossella Varvara, DIACR-Ita @ EVALITA2020: Overview of the EVALITA2020 Diachronic Lexical Semantics (DIACR-Ita) Task. 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop, EVALITA 2020, vol. 2765 (2020), 10.4000/BOOKS.AACCADEMIA.7613