Automated approaches for classifying structures

Authors: Mukund Deshpande, Michihiro Kuramochi, George Karypis

DOI: 10.21236/ADA439498

Keywords: Support vector machine, Classifier (UML), Domain knowledge, Computer science, Machine learning, Artificial intelligence

Abstract: In this paper we study the problem of classifying chemical compound datasets. We present an algorithm that first mines the dataset to discover discriminating sub-structures; these sub-structures are then used as features to build a powerful classifier. The advantage of our classification technique is that it requires very little domain knowledge and can easily handle large datasets. We evaluated the performance of our classifier on two widely available datasets and found it to give good results.
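
The two-stage idea in the abstract (mine discriminating sub-structures, then use them as features for a classifier) can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes each compound has already been reduced to a set of fragment labels, whereas the paper mines sub-structures from the compounds themselves. The function names, the `min_gap` threshold, and the toy data are all hypothetical. A simple perceptron stands in for the classifier; the keywords suggest the paper uses a support vector machine.

```python
# Toy data (hypothetical): each compound is a set of fragment labels.
# Label 1 = active, 0 = inactive.
compounds = [
    {"C-N", "C=O", "C-C"},
    {"C-N", "C-C"},
    {"C-Cl", "C-C"},
    {"C-Cl", "C=O"},
]
labels = [1, 1, 0, 0]


def support(fragment, group):
    """Fraction of compounds in `group` that contain `fragment`."""
    return sum(fragment in c for c in group) / len(group)


def mine_discriminating_fragments(compounds, labels, min_gap=0.75):
    """Keep fragments whose support differs between the two classes by at
    least `min_gap` (an assumed, simplistic notion of 'discriminating').
    Assumes both classes are non-empty."""
    pos = [c for c, y in zip(compounds, labels) if y == 1]
    neg = [c for c, y in zip(compounds, labels) if y == 0]
    candidates = set().union(*compounds)
    return sorted(
        f for f in candidates
        if abs(support(f, pos) - support(f, neg)) >= min_gap
    )


def to_features(compound, fragments):
    """Binary vector: 1 if the compound contains the fragment, else 0."""
    return [1 if f in compound else 0 for f in fragments]


def train_perceptron(X, y, epochs=10):
    """Plain perceptron, used here only as a stand-in classifier."""
    w, b = [0] * len(X[0]), 0
    for _ in range(epochs):
        for x, target in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = target - pred
            w = [wi + err * xi for wi, xi in zip(w, x)]
            b += err
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


fragments = mine_discriminating_fragments(compounds, labels)
X = [to_features(c, fragments) for c in compounds]
model = train_perceptron(X, labels)
```

On the toy data, only the fragments that separate the two classes survive the filter. An unseen compound is then classified by converting it with `to_features` and passing the vector to `predict`.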
