Keywords:
Abstract: Machine learning methods are becoming more and more popular in the field of computer-aided drug design. The specific data characteristics, including sparse, binary representation as well as noisy, imbalanced datasets, present a challenging classification problem. Currently, the two most successful models in such tasks are the Support Vector Machine (SVM) and Random Forest (RF). In this paper, we introduce the Weighted Tanimoto Extreme Learning Machine (T-WELM), an extremely simple and fast method for predicting chemical compound biological activity and possibly other data with discrete, binary representation. We show some theoretical properties of the proposed model, including the ability to learn arbitrary sets of examples. Further analysis shows numerous advantages of T-WELM over SVMs, RFs and traditional Extreme Learning Machines (ELM) in this particular task. Experiments performed on 40 large datasets of thousands of compounds show that T-WELMs achieve much better results and are at the same time faster, in terms of both training and further classification, than ELM models and other state-of-the-art methods in the field.