The use of text-mining and machine learning algorithms in systematic reviews: reducing workload in preclinical biomedical sciences and reducing human screening error

Authors: Alexandra Bannach-Brown, Piotr Przybyła, James Thomas, Andrew SC Rice, Sophia Ananiadou

DOI: 10.1101/255760

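Abstract: Background: In this paper we outline a method of applying machine learning (ML) algorithms to aid citation screening in an on-going broad and shallow systematic review, with the aim of achieving a high-performing algorithm comparable to human screening. Methods: We tested a range of machine learning algorithms. We applied ML algorithms to incremental numbers of training records and recorded their performance on sensitivity and specificity on an unseen validation set of papers. The performance of these algorithms was assessed on measures of recall, specificity, and accuracy. The classification results of the best-performing algorithm were taken forward and applied to the remaining unseen records in the dataset, and will be taken forward to the next stage of the systematic review. ML was also used to identify potential human errors during screening by analysing the training and validation datasets against the machine-ranked score. Results: We found that ML algorithms perform at a desirable level. Classifiers reached 98.7% sensitivity based on learning from a training set of 5749 records, with an inclusion prevalence of 13.2%. The highest level of specificity reached was 86%. Human errors in the training and validation sets were successfully identified using the machine-assigned scores to highlight discrepancies. Training the ML algorithm on the corrected dataset improved specificity without compromising sensitivity. Error analysis and correction sees a 3% increase or change in sensitivity and specificity, which increases the precision and accuracy of the ML algorithm. Conclusions: The technique of using ML to identify human error needs to be investigated in more depth; however, this pilot shows a promising approach to integrating human decisions and automation in systematic review methodology.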
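The two ideas in the abstract, scoring a classifier against human screening decisions and using the machine-assigned score to surface possible human errors, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `screening_metrics` and `flag_discrepancies`, the 0.5 decision threshold, the 0.3 margin, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def screening_metrics(human: List[int], predicted: List[int]) -> Dict[str, float]:
    """Compare binary include (1) / exclude (0) decisions against human labels."""
    pairs = list(zip(human, predicted))
    tp = sum(1 for h, p in pairs if h == 1 and p == 1)
    tn = sum(1 for h, p in pairs if h == 0 and p == 0)
    fp = sum(1 for h, p in pairs if h == 0 and p == 1)
    fn = sum(1 for h, p in pairs if h == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # also called recall
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
        "accuracy": _ratio(tp + tn, len(pairs)),
    }


def flag_discrepancies(
    human: List[int],
    scores: List[float],
    threshold: float = 0.5,  # assumed decision cut-off, not from the paper
    margin: float = 0.3,     # assumed distance from the cut-off that counts as "strong"
) -> List[int]:
    """Return indices of records whose human label strongly disagrees with the score.

    Flagged records are candidates for re-screening, not confirmed errors.
    """
    flagged = []
    for i, (label, score) in enumerate(zip(human, scores)):
        excluded_but_high = label == 0 and score >= threshold + margin
        included_but_low = label == 1 and score <= threshold - margin
        if excluded_but_high or included_but_low:
            flagged.append(i)
    return flagged


# Toy data: human decisions and hypothetical machine-assigned inclusion scores.
human_labels = [1, 0, 1, 0, 0, 1]
ml_scores = [0.9, 0.1, 0.15, 0.95, 0.4, 0.7]
ml_decisions = [1 if s >= 0.5 else 0 for s in ml_scores]

metrics = screening_metrics(human_labels, ml_decisions)
to_recheck = flag_discrepancies(human_labels, ml_scores)
```

In this toy run, records 2 and 3 are flagged because the human decision and the score point in opposite directions by a wide margin. In a workflow like the one the abstract describes, a reviewer would re-screen such records and the classifier would be retrained on the corrected labels.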
