Arabic word analogies and semantics of simple phrases

Authors: Stephen Taylor, Tomas Brychcin

DOI: 10.1109/ICNLSP.2018.8374386

Abstract: Vector semantic spaces, in which a multi-dimensional numeric vector is used to represent the meaning of a word, are making new natural language applications possible. Word analogies have become a standard tool to evaluate vector spaces. They also teach us something about what kinds of information a vector space can embody. Arabic orthography has morphological constructs which are realized in syntax in some other languages: the presence or absence of the article ??; the bi-??, ka-?? prepositional prefixes; verbs with object suffixes which constitute an entire sentence. The structured word-forms offer an opportunity to study how word representations interact to form verb phrases, noun phrases and prepositional phrases. We provide a corpus of analogies focused on words which participate in these constructs, and have conducted an examination of ten different vector spaces to see which of them is most appropriate for this set of analogies, and we illustrate the use of the analogies to examine phrase-building.
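
As an illustration of how a word analogy can be used to evaluate a vector space, the sketch below uses the common vector-offset method: for an analogy a : b :: c : d, it looks for the word whose vector is closest in cosine similarity to b - a + c. This is an assumption about the general technique, not necessarily the procedure used in the paper. The toy space, its three-dimensional vectors, the transliterated word forms and the function names are all hypothetical and chosen only to mimic the presence or absence of the article.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(space, a, b, c):
    """Answer a : b :: c : ? with the vector-offset method.

    The query words a, b and c are excluded from the candidates,
    as is usual in analogy evaluation.
    """
    target = [vb - va + vc for va, vb, vc in zip(space[a], space[b], space[c])]
    candidates = (w for w in space if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(space[w], target))


def analogy_accuracy(space, analogies):
    """Fraction of (a, b, c, d) analogies whose predicted answer equals d."""
    hits = sum(solve_analogy(space, a, b, c) == d for a, b, c, d in analogies)
    return hits / len(analogies)


# Hypothetical toy space (NOT data from the paper): the second dimension
# stands in for "has the definite article".
TOY_SPACE = {
    "kitab": [1.0, 0.0, 0.0],
    "al-kitab": [1.0, 1.0, 0.0],
    "bayt": [0.0, 0.0, 1.0],
    "al-bayt": [0.0, 1.0, 1.0],
    "qalam": [1.0, 0.0, 1.0],
    "al-qalam": [1.0, 1.0, 1.0],
}

# Hypothetical analogy set: bare noun : noun with article.
TOY_ANALOGIES = [
    ("kitab", "al-kitab", "bayt", "al-bayt"),
    ("bayt", "al-bayt", "qalam", "al-qalam"),
]

if __name__ == "__main__":
    print(solve_analogy(TOY_SPACE, "kitab", "al-kitab", "bayt"))
    print(analogy_accuracy(TOY_SPACE, TOY_ANALOGIES))
```

Comparing several vector spaces in this way would amount to running the same analogy set against each space and comparing the resulting accuracies; real spaces would be loaded from trained embeddings rather than written by hand.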
