Weighted Tanimoto Extreme Learning Machine with Case Study in Drug Discovery

Author: Wojciech Marian Czarnecki

DOI: 10.1109/MCI.2015.2437312

Abstract: Machine learning methods are becoming more and more popular in the field of computer-aided drug design. The specific data characteristics, including sparse, binary representation as well as noisy, imbalanced datasets, present a challenging binary classification problem. Currently, the two most successful models in such tasks are the Support Vector Machine (SVM) and Random Forest (RF). In this paper, we introduce the Weighted Tanimoto Extreme Learning Machine (T-WELM), an extremely simple and fast method for predicting chemical compound biological activity and possibly other data with discrete, binary representation. We show some theoretical properties of the proposed model, including the ability to learn arbitrary sets of examples. Further analysis shows numerous advantages of T-WELM over SVMs, RFs and traditional Extreme Learning Machines (ELM) in this particular task. Experiments performed on 40 large datasets of thousands of chemical compounds show that T-WELMs achieve much better classification results and are at the same time faster, in terms of both training time and further classification, than both ELM models and other state-of-the-art methods in the field.
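The abstract names the ingredients of the model (an Extreme Learning Machine, Tanimoto similarity on binary data, and weighting for imbalanced classes) without giving its equations. The following is a minimal sketch of how such pieces are commonly combined, not the paper's implementation. It assumes that hidden units are Tanimoto similarities to randomly chosen training fingerprints and that output weights come from a class-weighted ridge regression. The names `tanimoto`, `fit_twelm`, `predict_twelm`, and the parameters `n_hidden` and `C` are hypothetical.

```python
import numpy as np


def tanimoto(X, B):
    """Pairwise Tanimoto similarity between rows of binary matrices X and B."""
    X = np.asarray(X, dtype=float)
    B = np.asarray(B, dtype=float)
    inter = X @ B.T  # number of bits set in both vectors
    union = X.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - inter
    # Two all-zero vectors have an empty union; treat them as identical.
    return np.where(union > 0, inter / np.maximum(union, 1.0), 1.0)


def fit_twelm(X, y, n_hidden=50, C=1000.0, seed=0):
    """Fit the sketch. y holds labels in {-1, +1}. All choices here are assumptions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: hidden units are centred on randomly drawn training examples.
    idx = rng.choice(len(X), size=min(n_hidden, len(X)), replace=False)
    bases = X[idx]
    H = tanimoto(X, bases)  # hidden-layer activations
    # Assumption: each sample is weighted by the inverse size of its class.
    w = np.where(y > 0, 1.0 / np.sum(y > 0), 1.0 / np.sum(y <= 0))
    HtW = H.T * w  # H^T W with W = diag(w)
    # Weighted ridge solution: beta = (I/C + H^T W H)^{-1} H^T W y
    beta = np.linalg.solve(np.eye(H.shape[1]) / C + HtW @ H, HtW @ y)
    return bases, beta


def predict_twelm(model, X):
    """Return predicted labels in {-1, +1} for the rows of X."""
    bases, beta = model
    return np.where(tanimoto(X, bases) @ beta >= 0, 1, -1)
```

Training in this sketch reduces to one similarity matrix and one linear solve, which is consistent with the abstract's description of the method as simple and fast. The class weights keep the minority class (typically the active compounds) from being swamped by the majority class.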

References (45)
Tomaso Poggio, Federico Girosi. A Theory of Networks for Approximation and Learning. Massachusetts Institute of Technology (1989).
Sebastian G. Rohrer, Knut Baumann. Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. Journal of Chemical Information and Modeling, vol. 49, pp. 169-184 (2009). DOI: 10.1021/CI8002649
Nan-Ying Liang, Paramasivan Saratchandran, Guang-Bin Huang, Narasimhan Sundararajan. Classification of mental tasks from EEG signals using extreme learning machine. International Journal of Neural Systems, vol. 16, pp. 29-38 (2006). DOI: 10.1142/S0129065706000482
Lowell H. Hall, Lemont B. Kier. Electrotopological State Indices for Atom Types: A Novel Combination of Electronic, Topological, and Valence State Information. Journal of Chemical Information and Computer Sciences, vol. 35, pp. 1039-1045 (1995). DOI: 10.1021/CI00028A014
Cristina Bosco, Viviana Patti, Andrea Bolioli. Developing Corpora for Sentiment Analysis: The Case of Irony and Senti-TUT. IEEE Intelligent Systems, vol. 28, pp. 55-63 (2013). DOI: 10.1109/MIS.2013.28
J.A.K. Suykens, J. De Brabanter, L. Lukas, J. Vandewalle. Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing, vol. 48, pp. 85-105 (2002). DOI: 10.1016/S0925-2312(01)00644-0
Yoav Freund, Robert E. Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, vol. 55, pp. 119-139 (1997). DOI: 10.1006/JCSS.1997.1504
Thomas Scior, Andreas Bender, Gary Tresadern, José L. Medina-Franco, Karina Martínez-Mayorga, Thierry Langer, Karina Cuanalo-Contreras, Dimitris K. Agrafiotis. Recognizing Pitfalls in Virtual Screening: A Critical Review. Journal of Chemical Information and Modeling, vol. 52, pp. 867-881 (2012). DOI: 10.1021/CI200528D
Wei Duan, Xiaogang Qiu, Zhidong Cao, Xiaolong Zheng, Kainan Cui. Heterogeneous and Stochastic Agent-Based Models for Analyzing Infectious Diseases' Super Spreaders. IEEE Intelligent Systems, vol. 28, pp. 18-25 (2013). DOI: 10.1109/MIS.2013.29
Weiwei Zong, Guang-Bin Huang. Face recognition based on extreme learning machine. Neurocomputing, vol. 74, pp. 2541-2551 (2011). DOI: 10.1016/J.NEUCOM.2010.12.041