Enhancing random forest classification with NLP in DAMEH: A system for DAta Management in eHealth Domain

Authors: Luigi Coppolino, Giovanni Cozzolino, Flora Amato, Roberto Nardone, Giovanni Mazzeo

DOI: 10.1016/J.NEUCOM.2020.08.091

Keywords: Random forest; Natural language processing; eHealth; Statistical classification; Artificial intelligence; Computer science; Field (computer science); Data collection; Data management; Natural language; Wearable computer

Abstract: The use of pervasive IoT devices in Smart Cities has increased the volume of data produced in many fields. Interesting and very useful applications are growing in number in the E-health domain, where smart devices are used to manage huge amounts of data in highly distributed environments and to provide smart services able to collect data to fill the medical records of patients. The problem here is to gather data, to produce records, and to analyze medical records depending on their contents. Since data gathering involves very different devices (not only wearable medical sensors, but also environmental smart devices, such as weather, pollution, and other sensors), it is very difficult to classify data depending on their contents in order to enable better management of patients. Data from smart devices are coupled with medical records written in natural language: we describe an architecture that is able to determine the best features for classification, depending on existing medical records. The architecture is based on a pre-filtering phase built on Natural Language Processing, which is able to enhance Machine Learning classification based on Random Forests. We carried out experiments on about 5000 medical records from real (anonymized) case studies from various health-care organizations in Italy. We show the accuracy of the presented approach in terms of Accuracy-Rejection curves.
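
The abstract reports results as Accuracy-Rejection curves without showing how such a curve is built. The following is a minimal sketch of the general idea, not the authors' implementation: a classifier assigns each prediction a confidence score, predictions below a threshold are rejected, and accuracy is measured on the accepted ones only. The function name, the sample scores, and the idea of using the fraction of trees voting for the predicted class as the confidence are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def accuracy_rejection_curve(
    confidences: Sequence[float],
    correct: Sequence[bool],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, rejection_rate, accuracy_on_accepted) triples.

    confidences[i] is the classifier's confidence in prediction i
    (hypothetically, the share of forest trees voting for the predicted class);
    correct[i] says whether prediction i matched the true label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    total = len(confidences)
    curve = []
    for threshold in thresholds:
        # Keep only predictions the classifier is confident enough about.
        accepted = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
        rejection_rate = 1 - len(accepted) / total
        # Accuracy is undefined when every prediction is rejected.
        accuracy = sum(accepted) / len(accepted) if accepted else float("nan")
        curve.append((threshold, rejection_rate, accuracy))
    return curve


# Made-up scores for six predictions, purely illustrative.
sample_confidences = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40]
sample_correct = [True, True, True, False, True, False]
curve = accuracy_rejection_curve(sample_confidences, sample_correct, [0.0, 0.5, 0.7])

for threshold, rejection_rate, accuracy in curve:
    print(f"threshold={threshold:.2f} rejected={rejection_rate:.2f} accuracy={accuracy:.2f}")
```

Plotting accuracy against rejection rate for a fine grid of thresholds gives the curve; a classifier whose confidence is informative shows accuracy rising as more low-confidence predictions are rejected.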
