A genetic algorithm for Hierarchical Multi-Label Classification

Authors: Ricardo Cerri, Rodrigo C. Barros, Andre C. P. L. F. de Carvalho

DOI: 10.1145/2245276.2245325

Abstract: In Hierarchical Multi-Label Classification (HMC) problems, each example can be classified into two or more classes simultaneously, unlike in standard classification. Moreover, the classes are structured in a hierarchy, in the form of either a tree or a directed acyclic graph. Therefore, an example can be assigned to two or more paths of the hierarchical structure, resulting in a complex classification problem with possibly hundreds or thousands of classes. Several methods have been proposed to deal with such problems, some of them employing a single classifier to deal with all classes simultaneously (global methods), and others employing many classifiers to decompose the original problem into a set of subproblems (local methods). In this work, we propose a novel global method called HMC-GA, which employs a genetic algorithm for solving the HMC problem. In our approach, the genetic algorithm evolves the antecedents of classification rules, in order to optimize the level of coverage of each antecedent. Then, the set of optimized antecedents is selected to build the corresponding consequents of the rules (the set of classes to be predicted). Our method is compared to state-of-the-art HMC algorithms on protein function prediction datasets. The experimental results show that our approach presents competitive predictive accuracy, suggesting that genetic algorithms constitute a promising alternative for hierarchical multi-label classification of biological data.
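
The two steps named in the abstract (scoring an antecedent by its coverage, then deriving a consequent from the examples it covers) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the interval-test encoding of antecedents, the use of the mean class vector as the consequent, the toy data, and the names `Rule`, `coverage`, and `build_consequent`. The evolutionary loop itself (selection, crossover, mutation) is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed encoding: one optional (low, high) interval test per attribute.
# None means the attribute is not tested by the rule.
Interval = Optional[Tuple[float, float]]


@dataclass
class Rule:
    antecedent: List[Interval]

    def covers(self, x: List[float]) -> bool:
        """True if every active interval test is satisfied by example x."""
        return all(
            test is None or test[0] <= value <= test[1]
            for test, value in zip(self.antecedent, x)
        )


def coverage(rule: Rule, X: List[List[float]]) -> float:
    """Fraction of examples covered; a stand-in for the GA fitness signal."""
    return sum(rule.covers(x) for x in X) / len(X)


def build_consequent(rule: Rule, X: List[List[float]],
                     Y: List[List[int]]) -> List[float]:
    """Assumed consequent: per-class mean of the label vectors of covered examples."""
    covered = [y for x, y in zip(X, Y) if rule.covers(x)]
    if not covered:
        return [0.0] * len(Y[0])
    return [sum(column) / len(covered) for column in zip(*covered)]


# Toy data (hypothetical). Label columns stand for classes 1, 1.1, 1.2, 2;
# each label vector respects the hierarchy (1.1 or 1.2 implies 1).
X = [[0.1, 5.0], [0.4, 7.0], [0.9, 2.0], [0.3, 6.0]]
Y = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 0, 1], [1, 0, 1, 0]]

rule = Rule(antecedent=[(0.0, 0.5), None])  # tests only the first attribute
fitness = coverage(rule, X)
consequent = build_consequent(rule, X, Y)
print(fitness, consequent)
```

Each consequent value can be read as the estimated probability that a covered example belongs to the corresponding class; thresholding it yields a set of predicted classes.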

References (30)
Svetlana Kiritchenko, Fazel Famili, Stan Matwin. Functional Annotation of Genes Using Hierarchical Text Categorization (2005).
Ivica Dimitrovski, Dragi Kocev, Suzana Loskovska, Sašo Džeroski. Detection of visual concepts and annotation of images using ensembles of trees for hierarchical multi-label classification. International Conference on Pattern Recognition, pp. 152-161 (2010). DOI: 10.1007/978-3-642-17711-8_16
Luc De Raedt, Hendrik Blockeel, Jan Ramon. Top-Down Induction of Clustering Trees. International Conference on Machine Learning, pp. 55-63 (1998).
Hendrik Blockeel, Leander Schietgat, Jan Struyf, Sašo Džeroski, Amanda Clare. Decision Trees for Hierarchical Multilabel Classification: A Case Study in Functional Genomics. Lecture Notes in Computer Science, vol. 4213, pp. 18-29 (2006). DOI: 10.1007/11871637_7
Janez Demšar. Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, vol. 7, pp. 1-30 (2006).
Jan Struyf, Sašo Džeroski, Hendrik Blockeel, Amanda Clare. Hierarchical Multi-classification with Predictive Clustering Trees in Functional Genomics. Progress in Artificial Intelligence, vol. 3808, pp. 272-283 (2005). DOI: 10.1007/11595014_27
Roberto T. Alves, Myriam R. Delgado, Alex A. Freitas. Multi-label Hierarchical Classification of Protein Functions with Artificial Immune Systems. Advances in Bioinformatics and Computational Biology, pp. 1-12 (2008). DOI: 10.1007/978-3-540-85557-6_1
Lijuan Cai, Thomas Hofmann. Exploiting known taxonomies in learning overlapping concepts. International Joint Conference on Artificial Intelligence, pp. 714-719 (2007).
Celine Vens, Jan Struyf, Leander Schietgat, Sašo Džeroski, Hendrik Blockeel. Decision trees for hierarchical multi-label classification. Machine Learning, vol. 73, pp. 185-214 (2008). DOI: 10.1007/S10994-008-5077-3
Jesse Davis, Mark Goadrich. The relationship between Precision-Recall and ROC curves. Proceedings of the 23rd International Conference on Machine Learning (ICML '06), vol. 148, pp. 233-240 (2006). DOI: 10.1145/1143844.1143874