Authors: Muhammad Jawad Hussain, Shahbaz Hassan Wasti, Guangjian Huang, Yuncheng Jiang
DOI: 10.1016/J.DIB.2020.105377
Keywords:
Abstract: This data article compiles the detailed and descriptive experimental data of a Wikipedia-based semantic similarity approach called Neighbourhood Aggregated Semantic Contribution (NASC), presented in Hussain et al. [1]. The JWPL (Java Wikipedia Library) DataMachine and WikipediaAPI are used to extract the required features from the Wikipedia dump. The dataset presents the disambiguated concepts of the gold standard word similarity benchmarks MC30 (English), RG65es (Spanish), and RG65fr (French), together with their associated sets of categories in the corresponding Wikipedia category graph (WCG). It also contains the number of ancestors, common pages, and pages in the k-neighbourhood for different levels of the parameter k in the English, Spanish, and French WCGs. The dataset can be used to assess semantic similarity between concepts of the English (MC30), Spanish (RG65es), and French (RG65fr) benchmarks. Moreover, it will be useful for further analysis and comparison of taxonomic structures.