Better prediction of aqueous solubility of chlorinated hydrocarbons using support vector machine modeling

Authors: Behnoosh Bahadori, Morteza Atabati, Kobra Zarei

DOI: 10.1007/S10311-016-0561-7

Keywords: Feature selection, Aqueous solubility, Support vector machine, Organic chemistry, Pollutant, Partial least squares regression, Principal component regression, Computational chemistry, Chemistry, Quantitative structure–property relationship

Abstract: Remediation of water contaminated by organic pollutants is a major challenge, which could be improved by better knowledge of the aqueous solubility of these compounds. Indeed, aqueous solubility controls the fate and toxicity of pollutants. Here we performed a quantitative structure–property relationship study based on a genetic algorithm for the prediction of the aqueous solubility of chlorinated hydrocarbons. A total of 1497 descriptors were calculated with Dragon software. The genetic algorithm variable selection method was used to select, from this large pool of descriptors, an optimal subset with a significant contribution to the overall solubility. A support vector machine was then employed to model possible quantitative relationships between the selected descriptors and solubility. Our results show that total size, polarizability and electronegativity modify solubility. We also found that the support vector machine gave better results than other methods such as principal component regression and partial least squares.
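The variable selection step described in the abstract can be illustrated with a minimal sketch of a genetic algorithm that searches over descriptor subsets. This is not the authors' implementation. It assumes synthetic data in place of the Dragon descriptors, and it swaps the support vector machine for a leave-one-out 1-nearest-neighbour regressor so that the example stays dependency-free. All names (`loo_error`, `fitness`, `ga_select`) and all parameter values are hypothetical.

```python
import random


def loo_error(X, y, mask):
    """Leave-one-out mean squared error of a 1-nearest-neighbour regressor
    restricted to the descriptor columns switched on in `mask`.
    (Stand-in for the SVM used in the paper; an assumption of this sketch.)"""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return float("inf")  # an empty subset cannot predict anything
    total = 0.0
    for i in range(len(X)):
        best_d, best_y = float("inf"), 0.0
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_y = d, y[k]
        total += (y[i] - best_y) ** 2
    return total / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Higher is better: low validation error and few descriptors."""
    return -loo_error(X, y, mask) - penalty * sum(mask)


def ga_select(X, y, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Evolve boolean masks over the descriptor columns.
    Returns the best mask and the best fitness seen in each generation."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(n_feat)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        history.append(fitness(X, y, scored[0]))
        parents = scored[: pop_size // 2]   # truncation selection
        children = [scored[0][:]]           # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda m: fitness(X, y, m))
    history.append(fitness(X, y, best))
    return best, history


# Synthetic stand-in data: 40 "compounds", 8 "descriptors";
# only columns 0 and 3 actually influence the response.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(8)] for _ in range(40)]
y = [2.0 * row[0] - 1.5 * row[3] + data_rng.gauss(0, 0.05) for row in X]

best_mask, history = ga_select(X, y)
print("selected descriptor columns:", [j for j, keep in enumerate(best_mask) if keep])
```

In a workflow closer to the one in the abstract, the fitness function would instead score each subset with a cross-validated support vector regression, and the selected columns would then be passed to the final model.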
