Uncertainty sampling and transductive experimental design for active dual supervision

Authors: Vikas Sindhwani, Prem Melville, Richard D. Lawrence

DOI: 10.1145/1553374.1553496

Abstract: Dual supervision refers to the general setting of learning from both labeled examples as well as labeled features. Labeled features are naturally available in tasks such as text classification, where it is frequently possible to provide domain knowledge in the form of words that associate strongly with a class. In this paper, we consider the novel problem of active dual supervision, or, how to optimally query an example and feature labeling oracle to simultaneously collect two different forms of supervision, with the objective of building the best classifier in the most cost-effective manner. We apply classical uncertainty and experimental design based active learning schemes to graph/kernel-based dual supervision models. Empirical studies confirm the potential of these schemes to significantly reduce the cost of acquiring labeled data for training high-quality models.

References (16)
Nicholas Roy, Andrew McCallum. Toward Optimal Active Learning through Sampling Estimation of Error Reduction. International Conference on Machine Learning, pp. 441-448 (2001).
Mikhail Belkin, Irina Matveeva, Partha Niyogi. Regularization and Semi-supervised Learning on Large Graphs. Learning Theory, pp. 624-638 (2004). DOI: 10.1007/978-3-540-27819-1_43
Alexander J. Smola, Risi Kondor. Kernels and Regularization on Graphs. Learning Theory and Kernel Machines, pp. 144-158 (2003). DOI: 10.1007/978-3-540-45167-9_12
Shantanu Godbole, Abhay Harpale, Sunita Sarawagi, Soumen Chakrabarti. Document classification through interactive supervision of document and term labels. European Conference on Principles of Data Mining and Knowledge Discovery, pp. 185-196 (2004). DOI: 10.1007/978-3-540-30116-5_19
Richard D. Lawrence, Prem Melville, Wojciech Gryc. Sentiment analysis of blogs by combining lexical knowledge with text classification. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09), pp. 1275-1284 (2009). DOI: 10.1145/1557019.1557156
Ngoc-Diep Ho, Paul Van Dooren. On the pseudo-inverse of the Laplacian of a bipartite graph. Applied Mathematics Letters, vol. 18, pp. 917-922 (2005). DOI: 10.1016/J.AML.2004.07.034
Amir Globerson, Naftali Tishby, Fernando Pereira, Gal Chechik. Euclidean Embedding of Co-occurrence Data. Journal of Machine Learning Research, vol. 8, pp. 2265-2295 (2007).
Kai Yu, Jinbo Bi, Volker Tresp. Active learning via transductive experimental design. Proceedings of the 23rd International Conference on Machine Learning (ICML '06), pp. 1081-1088 (2006). DOI: 10.1145/1143844.1143980
Gregory Druck, Gideon Mann, Andrew McCallum. Learning from labeled features using generalized expectation criteria. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), pp. 595-602 (2008). DOI: 10.1145/1390334.1390436