Citation Classification Using Natural Language Processing and Machine Learning Models

Authors: Syyab Rahi, Iqra Safder, Sehrish Iqbal, Saeed-Ul Hassan, Iain Reid

DOI: 10.1007/978-3-030-53187-4_39

Abstract: In this paper, we address the problem of classifying citations as important or unimportant to the developments presented in research papers. We gather the features used by four state-of-the-art machine learning techniques and combine them with newly engineered, natural language-based features. On a known dataset of 465 citations, manually labeled by experts, our approach, built on a fine-tuned Random Forest classifier, outperformed prior work with a 90.7% F1 score and 97.7% precision. We also employ Convolutional Neural Networks with the AdamW optimizer and a focal loss function, a combination that converges quickly on small data, to achieve significant results.
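The focal loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their loss settings, so the function names and the defaults `gamma=2.0` and `alpha=0.25` (the values commonly quoted for focal loss) are assumptions for illustration only. The sketch shows the binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), next to plain cross-entropy.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy for one example; p is the predicted probability of class 1."""
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one example (gamma and alpha defaults are assumed, not from the paper)."""
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy positive (p = 0.9) versus a hard positive (p = 0.1).
    for p in (0.9, 0.1):
        print(p, binary_cross_entropy(p, 1), binary_focal_loss(p, 1))
```

With these assumed settings, a confidently correct prediction contributes far less to the loss than a misclassified one, so training concentrates on the hard citations. That property is one plausible reason for pairing focal loss with a small, imbalanced dataset.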
