Selection of Relevant Features in Machine Learning

Author: Pat Langley

DOI: 10.21236/ADA292575

Keywords: Feature selection; Computer science; Heuristic; Machine learning; k-nearest neighbors algorithm; Winnow; Sample complexity; Artificial intelligence

Abstract: In this paper, we review the problem of selecting relevant features for use in machine learning. We describe this problem in terms of heuristic search through a space of feature sets, and identify four dimensions along which approaches to the problem can vary. We consider recent work on feature selection in terms of this framework, then close with some challenges for future work in the area.

1. The Problem of Irrelevant Features

[…] accuracy) grow slowly with the number of irrelevant attributes. Theoretical results for algorithms that search restricted hypothesis spaces are encouraging. For instance, the worst-case number of errors made by Littlestone's (1987) WINNOW method grows only logarithmically with the number of irrelevant features. Pazzani and Sarrett's (1992) average-case analysis of WHOLIST, a simple conjunctive algorithm, and Langley and Iba's (1993) treatment of the naive Bayesian classifier, suggest that their sample complexities grow at most linearly with the number of irrelevant features. However, theoretical results are less optimistic for induction methods that search a larger space of concept descriptions. For example, Langley and Iba's (1993) analysis of the nearest neighbor algorithm indicates that its sample complexity grows exponentially with the number of irrelevant attributes, even for conjunctive target concepts. Experimental studies are consistent with this conclusion, and other experiments suggest that similar results hold even for methods that explicitly select features: decision-tree induction appears to handle irrelevants well for conjunctive concepts, but its sample complexity grows exponentially for parity concepts, since the evaluation metric cannot distinguish relevant from irrelevant features in the latter situation (Langley & Sage, in press). Results of this sort have encouraged machine learning researchers to explore more sophisticated approaches. In the sections that follow, we present a general framework for this task, along with examples of recent work on this important problem.
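The contrast the abstract draws hinges on WINNOW's multiplicative weight updates, whose mistake bound grows only logarithmically with the number of irrelevant attributes. The sketch below is a minimal illustration of that idea (a common variant with promotion by 2 and demotion by 1/2, not necessarily Littlestone's exact formulation); the target concept, feature count, and example generator are hypothetical choices for the demo.

```python
import random

def winnow(examples, n, threshold=None):
    """Multiplicative-update (Winnow-style) learner for monotone
    disjunctions over n Boolean features. Weights start at 1; on a
    false negative, active weights are doubled; on a false positive,
    active weights are halved. Returns final weights and mistake count."""
    if threshold is None:
        threshold = n
    w = [1.0] * n
    mistakes = 0
    for x, y in examples:
        pred = 1 if sum(w[i] for i in range(n) if x[i]) >= threshold else 0
        if pred != y:
            mistakes += 1
            if y == 1:  # false negative: promote active weights
                for i in range(n):
                    if x[i]:
                        w[i] *= 2
            else:       # false positive: demote active weights
                for i in range(n):
                    if x[i]:
                        w[i] /= 2
    return w, mistakes

# Target: disjunction of the first two features; the other 98 are irrelevant.
random.seed(0)
n = 100
examples = [(x, 1 if (x[0] or x[1]) else 0)
            for x in ([random.randint(0, 1) for _ in range(n)]
                      for _ in range(500))]
w, mistakes = winnow(examples, n)
print(mistakes)  # typically far fewer than n, despite 98 irrelevant features
```

The mistake bound for a k-literal disjunction is on the order of k log n, so adding irrelevant features costs only logarithmically more mistakes, in contrast to the exponential sample complexity the abstract reports for nearest neighbor.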

References (18)
Wayne Iba, Pat Langley. Average-case analysis of a nearest neighbor algorithm. International Joint Conference on Artificial Intelligence, pp. 889–894 (1993).
Thomas G. Dietterich, Hussein Almuallim. Learning with many irrelevant features. National Conference on Artificial Intelligence, pp. 547–552 (1991).
Andrew W. Moore, Mary S. Lee. Efficient Algorithms for Minimizing Cross Validation Error. Machine Learning Proceedings 1994, pp. 190–198 (1994). DOI: 10.1016/B978-1-55860-335-6.50031-3
Rich Caruana, Dayne Freitag. Greedy Attribute Selection. Machine Learning Proceedings 1994, pp. 28–36 (1994). DOI: 10.1016/B978-1-55860-335-6.50012-X
Stephanie Sage, Pat Langley. Oblivious Decision Trees and Abstract Cases (1994).
Miroslav Kubat, Doris Flotzinger, Gert Pfurtscheller. Discovering Patterns in EEG-Signals: Comparative Study of a Few Methods. European Conference on Machine Learning, pp. 366–371 (1993). DOI: 10.1007/3-540-56602-3_152
Kenji Kira, Larry A. Rendell. A Practical Approach to Feature Selection. International Conference on Machine Learning, pp. 249–256 (1992). DOI: 10.1016/B978-1-55860-247-2.50037-1
Josef Kittler, Pierre A. Devijver. Pattern recognition: a statistical approach. Prentice/Hall International (1982).