Linear transformations for cross-lingual semantic textual similarity

Author: Tomáš Brychcín

DOI: 10.1016/J.KNOSYS.2019.06.027

Abstract: Cross-lingual semantic textual similarity systems estimate the degree of meaning similarity between two sentences, each in a different language. State-of-the-art algorithms usually employ machine translation and combine a vast number of features, making the approach strongly supervised, resource-rich, and difficult to use for poorly resourced languages. In this paper, we study linear transformations, which project monolingual semantic spaces into a shared space using bilingual dictionaries. We propose a novel transformation, which builds on the best ideas from prior works. We experiment with unsupervised techniques for sentence similarity based only on semantic spaces, and we show that they can be significantly improved by word weighting. Our transformation outperforms other methods and, together with word weighting, leads to very promising results on several datasets in different languages.
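As a rough illustration of the general idea in the abstract, the sketch below fits a linear map between two word-vector spaces from a bilingual dictionary and then compares two sentences by the cosine of their weighted average word vectors. It uses a plain least-squares fit, which is one common choice and not the transformation proposed in the paper. The toy vocabulary, the random vectors, the weights, and the helper names (`fit_linear_map`, `sentence_vector`, `cosine`) are all assumptions made for this example.

```python
import numpy as np

def fit_linear_map(X_src, Y_tgt):
    """Least-squares matrix T minimising ||X_src @ T - Y_tgt||_F.

    Rows of X_src and Y_tgt are vectors of dictionary word pairs.
    This is a generic baseline, not the paper's proposed transformation.
    """
    T, *_ = np.linalg.lstsq(X_src, Y_tgt, rcond=None)
    return T

def sentence_vector(tokens, vectors, weights):
    """Weighted average of the vectors of tokens found in `vectors`.

    Assumes at least one token is known; unknown tokens are skipped.
    """
    known = [t for t in tokens if t in vectors]
    w = np.array([weights.get(t, 1.0) for t in known])
    V = np.stack([vectors[t] for t in known])
    return (w[:, None] * V).sum(axis=0) / w.sum()

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy data: the target space is an exact rotation of the source space.
rng = np.random.default_rng(0)
src_words = ["hund", "katze", "haus", "auto"]
tgt_words = ["dog", "cat", "house", "car"]
src_vecs = {w: rng.normal(size=3) for w in src_words}
R = np.linalg.qr(rng.normal(size=(3, 3)))[0]  # random orthogonal matrix
tgt_vecs = {t: src_vecs[s] @ R for s, t in zip(src_words, tgt_words)}

# The bilingual dictionary is the aligned word lists above.
X = np.stack([src_vecs[s] for s in src_words])
Y = np.stack([tgt_vecs[t] for t in tgt_words])
T = fit_linear_map(X, Y)

# Project source words into the target space, then compare sentences.
projected = {w: v @ T for w, v in src_vecs.items()}
weights = {"hund": 2.0, "haus": 1.0, "dog": 2.0, "house": 1.0}  # made-up word weights
sim = cosine(sentence_vector(["hund", "haus"], projected, weights),
             sentence_vector(["dog", "house"], tgt_vecs, weights))
print(sim)
```

In practice the word weights would come from corpus statistics such as inverse document frequency, and the vectors from trained monolingual embeddings; both are replaced by toy values here.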

References (37)
Manaal Faruqui, Chris Dyer. Improving Vector Space Word Representations Using Multilingual Correlation. Conference of the European Chapter of the Association for Computational Linguistics, pp. 462-471 (2014). DOI: 10.3115/V1/E14-1049
Iryna Gurevych, Torsten Zesch, Chris Biemann, Daniel Bär. UKP: Computing Semantic Textual Similarity by Combining Multiple Content Similarity Measures. Joint Conference on Lexical and Computational Semantics, vol. 1, pp. 435-440 (2012).
Birk Diedenhofen, Jochen Musch. cocor: A Comprehensive Solution for the Statistical Comparison of Correlations. PLOS ONE, vol. 10, e0121945 (2015). DOI: 10.1371/JOURNAL.PONE.0121945
Carl D. Meyer, Stephen L. Campbell. Generalized Inverses of Linear Transformations (1979).
Francis Jeffry Pelletier. The Principle of Semantic Compositionality. Topoi: An International Review of Philosophy, vol. 13, pp. 11-24 (1994). DOI: 10.1007/BF00763644
David R. Hardoon, Sandor Szedmak, John Shawe-Taylor. Canonical Correlation Analysis: An Overview with Application to Learning Methods. Neural Computation, vol. 16, pp. 2639-2664 (2004). DOI: 10.1162/0899766042321814
Ilya Sutskever, Tomas Mikolov, Quoc V. Le. Exploiting Similarities among Languages for Machine Translation. arXiv: Computation and Language (2013).
Mirjana Ivanović, Alexandros Nanopoulos, Miloš Radovanović. Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data. Journal of Machine Learning Research, vol. 11, pp. 2487-2531 (2010).
Md Arafat Sultan, Steven Bethard, Tamara Sumner. DLS@CU: Sentence Similarity from Word Alignment and Semantic Vector Composition. North American Chapter of the Association for Computational Linguistics, pp. 148-153 (2015). DOI: 10.18653/V1/S15-2027
Peter Young, Alice Lai, Micah Hodosh, Julia Hockenmaier. From Image Descriptions to Visual Denotations: New Similarity Metrics for Semantic Inference over Event Descriptions. Transactions of the Association for Computational Linguistics, vol. 2, pp. 67-78 (2014). DOI: 10.1162/TACL_A_00166