Experimental data for computing semantic similarity between concepts using multiple inheritances in Wikipedia category graph.

Authors: Muhammad Jawad Hussain, Shahbaz Hassan Wasti, Guangjian Huang, Yuncheng Jiang

DOI: 10.1016/J.DIB.2020.105377

Keywords:

Abstract: This data article compiles the detailed and descriptive experimental data of a Wikipedia-based semantic similarity approach called Neighbourhood Aggregated Semantic Contribution (NASC), presented in Hussain et al. [1]. The JWPL (Java Wikipedia Library) DataMachine and WikipediaAPI are used to extract the required features from the Wikipedia dump. The dataset presents the disambiguated Wikipedia concepts of the gold standard word similarity benchmarks MC30 (English), RG65es (Spanish) and RG65fr (French), together with their associated sets of categories in the corresponding Wikipedia category graph (WCG). It also contains the number of ancestors, the number of common pages, and the number of pages in the k-neighbourhood for different levels of the parameter k in the English, Spanish, and French WCGs. The data can be used to assess semantic similarity between concepts in the English (MC30), Spanish (RG65es), and French (RG65fr) benchmarks. Moreover, it will be useful for further analysis and comparison of taxonomic structures
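
To make the graph features named in the abstract concrete, here is a minimal Python sketch of counting ancestors and a k-neighbourhood in a category graph with multiple inheritance. It is an illustration only, not the NASC implementation: the toy graph, the category names, and the function names are hypothetical, and it assumes that the k-neighbourhood of a category means the categories reachable within k upward (child-to-parent) hops, which this record does not define.

```python
from collections import deque

# Hypothetical toy category graph: category -> list of parent categories.
# "Sports cars" has two parents, i.e. multiple inheritance.
PARENTS = {
    "Sports cars": ["Cars", "Sports equipment"],
    "Cars": ["Vehicles"],
    "Sports equipment": ["Equipment"],
    "Vehicles": ["Technology"],
    "Equipment": ["Technology"],
    "Technology": [],
}


def k_neighbourhood(start, k, parents=PARENTS):
    """Return categories reachable from `start` in at most k upward hops.

    Assumption: 'k-neighbourhood' is read here as an upward, hop-bounded
    breadth-first search; the start category itself is excluded.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                frontier.append((parent, depth + 1))
    seen.discard(start)
    return seen


def ancestors(start, parents=PARENTS):
    """Return all ancestors of `start` (hop bound set to the graph size)."""
    return k_neighbourhood(start, len(parents), parents)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, sorted(k_neighbourhood("Sports cars", k)))
    print("ancestors:", len(ancestors("Sports cars")))
```

Because "Technology" is reached through both parent chains, the `seen` set ensures it is counted once. The dataset itself reports page counts per neighbourhood, which this sketch does not model.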

References (9)
Anna Formica. Ontology-based concept similarity in Formal Concept Analysis. Information Sciences, vol. 176, pp. 2624-2641 (2006). DOI: 10.1016/J.INS.2005.11.014
Yuncheng Jiang, Xiaopei Zhang, Yong Tang, Ruihua Nie. Feature-based approaches to semantic similarity assessment of concepts using Wikipedia. Information Processing and Management, vol. 51, pp. 215-234 (2015). DOI: 10.1016/J.IPM.2015.01.001
Zvi Galil. Efficient algorithms for finding maximum matching in graphs. ACM Computing Surveys, vol. 18, pp. 23-38 (1986). DOI: 10.1145/6462.6502
George A. Miller, Walter G. Charles. Contextual correlates of semantic similarity. Language and Cognitive Processes, vol. 6, pp. 1-28 (1991). DOI: 10.1080/01690969108406936
Iryna Gurevych. Using the Structure of a Conceptual Network in Computing Semantic Relatedness. Lecture Notes in Computer Science, pp. 767-778 (2005). DOI: 10.1007/11562214_67
José Camacho-Collados, Mohammad Taher Pilehvar, Roberto Navigli. A Framework for the Construction of Monolingual and Cross-lingual Word Similarity Datasets. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), vol. 2, pp. 1-7 (2015). DOI: 10.3115/V1/P15-2001
Yuncheng Jiang, Wen Bai, Xiaopei Zhang, Jiaojiao Hu. Wikipedia-based information content and semantic similarity computation. Information Processing and Management, vol. 53, pp. 248-265 (2017). DOI: 10.1016/J.IPM.2016.09.001
Shahbaz Hassan Wasti, Muhammad Jawad Hussain, Guangjian Huang, Aftab Akram, Yuncheng Jiang, Yong Tang. Assessing semantic similarity between concepts: A weighted-feature-based approach. Concurrency and Computation: Practice and Experience, vol. 32 (2020). DOI: 10.1002/CPE.5594
Muhammad Jawad Hussain, Shahbaz Hassan Wasti, Guangjian Huang, Lina Wei, Yuncheng Jiang, Yong Tang. An approach for measuring semantic similarity between Wikipedia concepts using multiple inheritances. Information Processing and Management, vol. 57, article 102188 (2020). DOI: 10.1016/J.IPM.2019.102188