Active Feature-Value Acquisition

Authors: Maytal Saar-Tsechansky, Prem Melville, Foster Provost

DOI: 10.1287/MNSC.1080.0952

Keywords:

Abstract: Most induction algorithms for building predictive models take as input training data in the form of feature vectors. Acquiring the values of features may be costly, and simply acquiring all values may be wasteful or prohibitively expensive. Active feature-value acquisition (AFA) selects features incrementally in an attempt to improve the predictive model most cost-effectively. This paper presents a framework for AFA based on estimating information value. Although straightforward in principle, estimations and approximations must be made to apply the framework in practice. We present an acquisition policy, sampled expected utility (SEU), that employs particular estimations to enable effective ranking of potential acquisitions in settings where relatively little information is available about the underlying domain. We then present experimental results showing that, compared with the policy of using representative sampling for feature acquisition, SEU reduces the cost of producing a model of a desired accuracy and exhibits consistent performance across domains. We also extend the framework to a more general modeling setting in which feature values as well as class labels are missing and are costly to acquire.
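The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the utility of a candidate acquisition is the expected gain in model performance divided by the acquisition cost, and that only a random subsample of candidates is scored. All names (`expected_utility`, `sampled_expected_utility`, `value_probs`, `gain`, `cost`) are hypothetical, and the caller must supply the estimators.

```python
import random
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Tuple

# A candidate acquisition: (instance index, feature index) of a missing value.
Query = Tuple[int, int]


def expected_utility(
    query: Query,
    value_probs: Callable[[Query], Dict[Hashable, float]],
    gain: Callable[[Query, Hashable], float],
    cost: Callable[[Query], float],
) -> float:
    """Expected performance gain per unit cost (an assumed utility definition).

    value_probs(query) estimates P(missing value = v) for each possible v;
    gain(query, v) estimates the model improvement if the value turns out to be v.
    """
    expected_gain = sum(p * gain(query, v) for v, p in value_probs(query).items())
    return expected_gain / cost(query)


def sampled_expected_utility(
    candidates: Iterable[Query],
    value_probs: Callable[[Query], Dict[Hashable, float]],
    gain: Callable[[Query, Hashable], float],
    cost: Callable[[Query], float],
    sample_size: int,
    batch_size: int,
    rng: Optional[random.Random] = None,
) -> List[Query]:
    """Score a random subsample of candidates and return the top-ranked ones."""
    rng = rng or random.Random(0)
    pool = list(candidates)
    sample = rng.sample(pool, min(sample_size, len(pool)))
    ranked = sorted(
        sample,
        key=lambda q: expected_utility(q, value_probs, gain, cost),
        reverse=True,
    )
    return ranked[:batch_size]


# Toy inputs, invented purely for illustration.
toy_gains = {
    (0, 0): {0: 0.0, 1: 0.2},
    (0, 1): {0: 0.3, 1: 0.5},
    (1, 0): {0: 0.1, 1: 0.1},
}
toy_costs = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 1.0}

best = sampled_expected_utility(
    candidates=toy_gains.keys(),
    value_probs=lambda q: {0: 0.5, 1: 0.5},
    gain=lambda q, v: toy_gains[q][v],
    cost=lambda q: toy_costs[q],
    sample_size=3,
    batch_size=1,
)
print(best)
```

Scoring only a subsample keeps the cost of ranking manageable when the number of missing values is large. In the toy data the query `(0, 1)` ranks first despite its higher cost, because its expected gain is proportionally larger.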

References (36)
Lawrence E. Fouraker, Daniel Ellsberg, Sidney Siegel, Bargaining and group decision making (1960)
Jeffrey C. Schlimmer, Ming Tan, Two case studies in cost-sensitive concept acquisition. National Conference on Artificial Intelligence, pp. 854-860 (1990)
Nicholas Roy, Andrew McCallum, Toward Optimal Active Learning through Sampling Estimation of Error Reduction. International Conference on Machine Learning, pp. 441-448 (2001)
David D. Jensen, Paul R. Cohen, Multiple Comparisons in Induction Algorithms. Machine Learning, vol. 38, pp. 309-338 (2000), 10.1023/A:1007631014630
Russell Greiner, Adam J. Grove, Dan Roth, Learning cost-sensitive active classifiers. Artificial Intelligence, vol. 139, pp. 137-174 (2002), 10.1016/S0004-3702(02)00209-6
Yoav Freund, H. Sebastian Seung, Eli Shamir, Naftali Tishby, Selective Sampling Using the Query by Committee Algorithm. Machine Learning, vol. 28, pp. 133-168 (1997), 10.1023/A:1007330508534
Pat Langley, Crafting Papers on Machine Learning. International Conference on Machine Learning, pp. 1207-1216 (2000)
Keki B. Irani, Usama M. Fayyad, Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. International Joint Conference on Artificial Intelligence, vol. 2, pp. 1022-1027 (1993)
George H. John, Ron Kohavi, Karl Pfleger, Irrelevant Features and the Subset Selection Problem. Machine Learning Proceedings 1994, pp. 121-129 (1994), 10.1016/B978-1-55860-335-6.50023-4
Peter D. Turney, Types of cost in inductive concept learning. arXiv: Learning (2000)