Toward robust QSPR models: Synergistic utilization of robust regression and variable elimination

Authors: Rainer Grohmann, Torsten Schindler

DOI: 10.1002/JCC.20831

Abstract: Widely used regression approaches in modeling quantitative structure-property relationships, such as PLS regression, are highly susceptible to outlying observations that will impair the prognostic value of a model. Our aim is to compile homogeneous datasets as the basis for regression modeling by removing outlying compounds and applying variable selection. We investigate different approaches to create robust, outlier-resistant regression models in the field of prediction of drug molecules' permeability. The objective is to join the strength of outlier detection and variable elimination, increasing the predictive power of the models. In conclusion, outlier detection is employed to identify multiple, homogeneous data subsets for regression modeling.

References (18)
Sijmen de Jong, SIMPLS: an alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems, vol. 18, pp. 251-263 (1993). DOI: 10.1016/0169-7439(93)85002-X
Katrien Van Driessen, Peter J. Rousseeuw, A fast algorithm for the minimum covariance determinant estimator. Technometrics, vol. 41, pp. 212-223 (1999). DOI: 10.2307/1270566
William J. Egan, Georgio Lauri, Prediction of intestinal permeability. Advanced Drug Delivery Reviews, vol. 54, pp. 273-289 (2002). DOI: 10.1016/S0169-409X(02)00004-2
Ruifeng Liu, Hongmao Sun, Sung-Sau So, Development of quantitative structure-property relationship models for early ADME evaluation in drug discovery. 2. Blood-brain barrier penetration. Journal of Chemical Information and Computer Sciences, vol. 41, pp. 1623-1632 (2001). DOI: 10.1021/CI010290I
Evgeny Byvatov, Gisbert Schneider, SVM-based feature selection for characterization of focused compound collections. Journal of Chemical Information and Computer Sciences, vol. 44, pp. 993-999 (2004). DOI: 10.1021/CI0342876
Mia Hubert, Peter J. Rousseeuw, Karlien Vanden Branden, ROBPCA: A New Approach to Robust Principal Component Analysis. Technometrics, vol. 47, pp. 64-79 (2005). DOI: 10.1198/004017004000000563
Svante Wold, Pattern recognition by means of disjoint principal components models. Pattern Recognition, vol. 8, pp. 127-139 (1976). DOI: 10.1016/0031-3203(76)90014-5
Franco Lombardo, James F. Blake, William J. Curatolo, Computation of brain-blood partitioning of organic solutes via free energy calculations. Journal of Medicinal Chemistry, vol. 39, pp. 4750-4755 (1996). DOI: 10.1021/JM960163R
Han van de Waterbeemd, Eric Gifford, ADMET in silico modelling: towards prediction paradise? Nature Reviews Drug Discovery, vol. 2, pp. 192-204 (2003). DOI: 10.1038/NRD1032