Authors: Jihyeun Lee, Surendra Kumar, Sang-Yoon Lee, Sung Jean Park, Mi-hyun Kim
Keywords:
Abstract: S100A9 is a potential therapeutic target for various diseases, including prostate cancer, colorectal cancer, and Alzheimer's disease. However, the sparsity of atomic-level data, such as data on the protein-protein interactions of S100A9 with RAGE, TLR4/MD2, or CD147 (EMMPRIN), hinders the rational design of S100A9 inhibitors. Herein, we report the first predictive models of S100A9 inhibitory effect, built by applying machine learning classifiers to 2D molecular descriptors. The models were optimized through feature selectors as well as classifiers to produce the top eight random forest models, which combine robust predictability with high cost-effectiveness. Notably, the optimal feature sets were obtained after reducing 2,798 features to dozens of features by chopping fingerprint bits. Moreover, the efficiency of the compact feature sets allowed us to further screen a large-scale dataset (over 6,000,000 compounds) within a week. Through a consensus vote of the top models, 46 hits (hit rate = 0.000713%) were identified as potential S100A9 inhibitors. We expect that our models will facilitate the drug discovery process by providing high predictive power and cost-reduction ability, and will give insights into designing novel drugs targeting S100A9.
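The consensus-vote step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether the vote was unanimous or by majority, so the `min_votes` threshold, the function names `consensus_hits` and `hit_rate_percent`, and the toy predictions are all assumptions made for illustration. Each model is assumed to emit a binary label per compound (1 = predicted inhibitor).

```python
from typing import Dict, List, Sequence


def consensus_hits(predictions: Dict[str, Sequence[int]], min_votes: int) -> List[int]:
    """Return indices of compounds labeled active (1) by at least `min_votes` models.

    `min_votes` is a hypothetical parameter: the abstract does not state
    the voting rule that was actually used.
    """
    models = list(predictions.values())
    if not models:
        return []
    n_compounds = len(models[0])
    if any(len(p) != n_compounds for p in models):
        raise ValueError("all models must score the same set of compounds")
    hits = []
    for i in range(n_compounds):
        votes = sum(p[i] for p in models)  # number of models voting "active"
        if votes >= min_votes:
            hits.append(i)
    return hits


def hit_rate_percent(n_hits: int, n_screened: int) -> float:
    """Hit rate expressed as a percentage of the screened compounds."""
    return 100.0 * n_hits / n_screened


# Toy data (not from the paper): three models scoring five compounds.
toy_predictions = {
    "model_a": [1, 0, 1, 1, 0],
    "model_b": [1, 0, 1, 1, 0],
    "model_c": [1, 1, 0, 1, 0],
}

unanimous = consensus_hits(toy_predictions, min_votes=3)  # every model agrees
majority = consensus_hits(toy_predictions, min_votes=2)   # at least two of three agree
```

A stricter threshold trades recall for precision: in the toy data the unanimous vote keeps fewer compounds than the majority vote. For a screen of millions of compounds, a strict rule keeps the hit list small enough for experimental follow-up.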