Use of Random forest in the identification of important variables

Authors: Betina PO Lovatti, Márcia HC Nascimento, Álvaro C Neto, Eustaquio VR Castro, Paulo R Filgueiras

DOI: 10.1016/J.MICROC.2018.12.028

Keywords: Identification (information), Crude oil, Variable (computer science), Artificial intelligence, Mathematics, Pattern recognition, Pour point, Proton NMR, Carbon-13 NMR, Selection (genetic algorithm), Random forest

Abstract: The Random Forest (RF) technique has been shown to be promising in supervised classification applied to different matrices. However, approaches for identifying the significant variables that carry weight in the model are scarce in classification problems. In this paper, we propose a methodology for the selection of the variables of greater relevance in the construction of RF models. To apply the methodology, models were developed for discriminating crude oil samples according to their maximum pour point (MPP). For this purpose, MPP data (ASTM D5853) and hydrogen (1H) and carbon (13C) NMR spectra were acquired for 105 samples. With MPP ranging from −54 °C to 39 °C, two classes were assigned: the first containing 43 samples with MPP value ≤ −9 °C, and the second containing 62 samples with MPP value > −9 °C. The 1H NMR models, with 90% accuracy, and the 13C NMR models, with 71% accuracy, were used in the variable selection method. The results showed that the proposed method for selecting variables was effective in distinguishing the variables that best contributed to the discrimination of the oils. Therefore, the new tool enabled an understanding of the chemical information of interest contained in the variables and its relationship to the property of the samples.
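
The general idea of ranking variables by their contribution to a tree ensemble can be sketched as follows. This is not the authors' selection method, which the abstract does not detail. It is a heavily simplified stand-in that bags one-split decision stumps over random variable subsets and accumulates each variable's mean Gini impurity decrease. The data are synthetic and all names (`best_stump`, `INFORMATIVE`, `MTRY`, and so on) are hypothetical; only the −54 °C to 39 °C range, the −9 °C class threshold, and the sample count of 105 are taken from the abstract.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(X, y, features):
    """Return (gain, feature, threshold) of the best single split
    among the candidate features, measured by Gini impurity decrease."""
    n, parent = len(y), gini(y)
    best = (0.0, None, None)
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[0]:
                best = (parent - child, f, t)
    return best

rng = random.Random(0)
N_SAMPLES, N_VARS, INFORMATIVE = 105, 5, 2   # synthetic setup (assumption)
THRESHOLD_C = -9.0                           # class boundary from the abstract

# Synthetic pour points and class labels: 1 if MPP > -9 degrees C, else 0.
mpp = [rng.uniform(-54.0, 39.0) for _ in range(N_SAMPLES)]
y = [1 if m > THRESHOLD_C else 0 for m in mpp]

# Synthetic "spectral" variables: pure noise, except one that tracks MPP.
X = []
for m in mpp:
    row = [rng.gauss(0.0, 1.0) for _ in range(N_VARS)]
    row[INFORMATIVE] = m / 10.0 + rng.gauss(0.0, 0.5)
    X.append(row)

# Bagging: each stump sees a bootstrap sample and MTRY random variables.
N_TREES, MTRY = 60, 2
importance = [0.0] * N_VARS
for _ in range(N_TREES):
    idx = [rng.randrange(N_SAMPLES) for _ in range(N_SAMPLES)]
    Xb = [X[i] for i in idx]
    yb = [y[i] for i in idx]
    gain, f, t = best_stump(Xb, yb, rng.sample(range(N_VARS), MTRY))
    if f is not None:
        importance[f] += gain / N_TREES

ranking = sorted(range(N_VARS), key=lambda j: importance[j], reverse=True)
print("variables ranked by mean impurity decrease:", ranking)
```

With real spectra, each variable would correspond to a chemical-shift position, so a high-ranked variable points to the spectral region most associated with the class boundary. A full RF implementation would grow deeper trees and would typically be taken from an established library rather than written by hand.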
