Classifying Documents within Multiple Hierarchical Datasets Using Multi-task Learning

Authors: Azad Naik, Anveshi Charuvaka, Huzefa Rangwala

DOI: 10.1109/ICTAI.2013.65

Abstract: Multi-task learning (MTL) is a supervised learning paradigm in which the prediction models for several related tasks are learned jointly to achieve better generalization performance. When there are only a few training examples per task, MTL considerably outperforms traditional single-task learning (STL) in terms of accuracy. In this work we develop an MTL-based approach for classifying documents that are archived within dual concept hierarchies, namely DMOZ and Wikipedia. We solve the multi-class classification problem by defining one-versus-rest binary classification tasks for each of the different classes across the two hierarchical datasets. Instead of learning a linear discriminant for each class independently, we use an MTL approach, with relationships between the classes across the two datasets established using a non-parametric, lazy, nearest neighbor approach. We also evaluate a transfer learning (TL) approach and compare the MTL (and TL) methods against standard single-task learning and semi-supervised learning approaches. Our empirical results demonstrate the strength of our developed methods, which show an improvement especially when there are fewer training examples per classification task.
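
The two building blocks named in the abstract, one-versus-rest task definition and nearest-neighbor linking of classes across datasets, can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine similarity measure, the majority-vote rule, the function names `one_vs_rest_labels` and `related_classes`, and the toy vectors are all assumptions made for illustration, since the abstract does not specify these details.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length term-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def one_vs_rest_labels(labels):
    """Define one binary task per class: +1 for that class, -1 for the rest."""
    return {c: [1 if y == c else -1 for y in labels] for c in sorted(set(labels))}


def related_classes(X_src, y_src, X_tgt, y_tgt, k=1):
    """Link each source class to a target class by nearest-neighbor voting.

    Assumed rule (not taken from the paper): every source document votes for
    the labels of its k most similar target documents, and each source class
    is paired with the target class that collects the most votes.
    """
    votes = {c: Counter() for c in set(y_src)}
    for x, c in zip(X_src, y_src):
        ranked = sorted(
            range(len(X_tgt)),
            key=lambda j: cosine(x, X_tgt[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            votes[c][y_tgt[j]] += 1
    return {c: v.most_common(1)[0][0] for c, v in votes.items()}


# Toy term-frequency vectors over a hypothetical vocabulary
# [goal, match, gene, cell]; class names are illustrative only.
dmoz_X = [[3, 2, 0, 0], [2, 3, 0, 1], [0, 0, 3, 2], [0, 1, 2, 3]]
dmoz_y = ["Sports", "Sports", "Science", "Science"]
wiki_X = [[4, 1, 0, 0], [1, 4, 0, 0], [0, 0, 4, 1], [0, 0, 1, 4]]
wiki_y = ["Football", "Football", "Biology", "Biology"]

tasks = one_vs_rest_labels(dmoz_y)
pairs = related_classes(dmoz_X, dmoz_y, wiki_X, wiki_y, k=1)
print(tasks)
print(pairs)
```

In an MTL setting, the class pairs found this way would indicate which binary tasks from the two hierarchies should be learned jointly; the joint learning step itself is omitted here.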
