Inconsistent Node Flattening for Improving Top-Down Hierarchical Classification

Authors: Azad Naik, Huzefa Rangwala

DOI: 10.1109/DSAA.2016.47

Keywords: Image (mathematics), Data mining, Computer science, Top-down and bottom-up design, Source code, Node (networking), Process (computing), Hierarchy, Propagation of uncertainty, Class (biology)

Abstract: Large-scale classification of data where classes are structurally organized in a hierarchy is an important area of research. Top-down approaches that exploit the hierarchy during the learning and prediction phases are efficient for large-scale hierarchical classification. However, the accuracy of top-down approaches is poor due to error propagation, i.e., errors made at higher levels of the hierarchy cannot be corrected at lower levels. One main reason behind these errors is the presence of inconsistent nodes, introduced by the arbitrary process of creating these hierarchies by domain experts. In this paper, we propose two different data-driven approaches (local and global) for hierarchical structure modification that identify and flatten inconsistent nodes present within the hierarchy. Our extensive empirical evaluation of the proposed approaches on several image and text datasets with varying distributions of features, classes, and training instances per class shows improved classification performance over competing approaches. Specifically, we see an improvement of up to 7% in Macro-F1 score with our approach over the best TD baseline. Source code: http://www.cs.gmu.edu/ mlbio/InconsistentNodeFlattening.
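The two operations the abstract refers to, top-down prediction and flattening a node, can be illustrated with a minimal sketch. It assumes a hierarchy stored as a parent-to-children dictionary and a hypothetical `score(child, x)` function standing in for the per-node classifiers. The names `flatten_node` and `predict_top_down` and the toy hierarchy are illustrative only, and the sketch does not implement the paper's local or global criteria for deciding which nodes are inconsistent.

```python
from typing import Callable, Dict, List


def flatten_node(children: Dict[str, List[str]], parent: str, node: str) -> None:
    """Remove the internal `node` and attach its children directly to `parent`."""
    if node not in children:
        raise ValueError(f"{node!r} is a leaf; only internal nodes can be flattened")
    idx = children[parent].index(node)
    # Splice the removed node's children into the position it occupied.
    children[parent][idx:idx + 1] = children.pop(node)


def predict_top_down(
    children: Dict[str, List[str]],
    score: Callable[[str, object], float],
    root: str,
    x: object,
) -> str:
    """Greedy descent: at each level, follow the highest-scoring child."""
    node = root
    while children.get(node):
        node = max(children[node], key=lambda c: score(c, x))
    return node


# Toy hierarchy (hypothetical): suppose "animals" was judged inconsistent.
hierarchy = {
    "root": ["animals", "vehicles"],
    "animals": ["cat", "dog"],
    "vehicles": ["car", "truck"],
}
flatten_node(hierarchy, "root", "animals")
# "cat" and "dog" are now decided at the root, so a wrong decision at the
# removed "animals" node can no longer propagate down to them.
```

Flattening trades a deeper tree with fewer choices per node for a shallower one with more choices at the parent, which removes one point where an early mistake would become irreversible.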
