Selection of Relevant Features in Machine Learning

Author: Pat Langley

DOI: 10.21236/ADA292575

Keywords: Feature selection; Computer science; Heuristic; Machine learning; k-nearest neighbors algorithm; Winnow; Sample complexity; Artificial intelligence

Abstract: In this paper, we review the problem of selecting relevant features for use in machine learning. We describe this problem in terms of heuristic search through a space of feature sets, and identify four dimensions along which approaches to the problem can vary. We consider recent work on feature selection in terms of this framework, then close with some challenges for future work in the area.

1. The Problem of Irrelevant Features

[…] accuracy) grow slowly with the number of irrelevant attributes. Theoretical results for algorithms that search restricted hypothesis spaces are encouraging. For instance, the worst-case number of errors made by Littlestone's (1987) WINNOW method grows only logarithmically with the number of irrelevant features. Pazzani and Sarrett's (1992) average-case analysis of WHOLIST, a simple conjunctive algorithm, and Langley and Iba's (1993) treatment of the naive Bayesian classifier, suggest that their sample complexities grow at most linearly with the number of irrelevant features. However, theoretical results are less optimistic for induction methods that search a larger space of concept descriptions. For example, Langley and Iba's (1993) analysis of the nearest neighbor algorithm indicates that its sample complexity grows exponentially with the number of irrelevant attributes, even for conjunctive target concepts. Experimental studies are consistent with this conclusion, and other experiments suggest that similar results hold even for methods that explicitly select features: decision-tree induction appears to handle irrelevants well for conjunctive concepts, but its sample complexity grows exponentially for parity concepts, since the evaluation metric cannot distinguish relevant from irrelevant features in the latter situation (Langley & Sage, in press). Results of this sort have encouraged machine learning researchers to explore more sophisticated approaches. In the sections that follow, we present a general framework for this task, along with examples of recent work on this important problem.
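The contrast the abstract draws hinges on WINNOW's multiplicative weight updates, whose mistake bound grows only logarithmically with the number of irrelevant attributes. The sketch below is a minimal illustration of that idea (a common variant with promotion by 2 and demotion by 1/2, not necessarily Littlestone's exact formulation); the target concept, feature count, and example generator are hypothetical choices for the demo.

```python
import random

def winnow(examples, n, threshold=None):
    """Multiplicative-update (Winnow-style) learner for monotone
    disjunctions over n Boolean features. Weights start at 1; on a
    false negative, active weights are doubled; on a false positive,
    active weights are halved. Returns final weights and mistake count."""
    if threshold is None:
        threshold = n
    w = [1.0] * n
    mistakes = 0
    for x, y in examples:
        pred = 1 if sum(w[i] for i in range(n) if x[i]) >= threshold else 0
        if pred != y:
            mistakes += 1
            if y == 1:  # false negative: promote active weights
                for i in range(n):
                    if x[i]:
                        w[i] *= 2
            else:       # false positive: demote active weights
                for i in range(n):
                    if x[i]:
                        w[i] /= 2
    return w, mistakes

# Target: disjunction of the first two features; the other 98 are irrelevant.
random.seed(0)
n = 100
examples = [(x, 1 if (x[0] or x[1]) else 0)
            for x in ([random.randint(0, 1) for _ in range(n)]
                      for _ in range(500))]
w, mistakes = winnow(examples, n)
print(mistakes)  # typically far fewer than n, despite 98 irrelevant features
```

The mistake bound for a k-literal disjunction is on the order of k log n, so adding irrelevant features costs only logarithmically more mistakes, in contrast to the exponential sample complexity the abstract reports for nearest neighbor.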

References (18)
Wayne Iba, Pat Langley. Average-case analysis of a nearest neighbor algorithm. International Joint Conference on Artificial Intelligence, pp. 889–894 (1993).
Thomas G. Dietterich, Hussein Almuallim. Learning with many irrelevant features. National Conference on Artificial Intelligence, pp. 547–552 (1991).
Andrew W. Moore, Mary S. Lee. Efficient Algorithms for Minimizing Cross Validation Error. Machine Learning Proceedings 1994, pp. 190–198 (1994). DOI: 10.1016/B978-1-55860-335-6.50031-3
Rich Caruana, Dayne Freitag. Greedy Attribute Selection. Machine Learning Proceedings 1994, pp. 28–36 (1994). DOI: 10.1016/B978-1-55860-335-6.50012-X
Stephanie Sage, Pat Langley. Oblivious Decision Trees and Abstract Cases (1994).
Miroslav Kubat, Doris Flotzinger, Gert Pfurtscheller. Discovering Patterns in EEG-Signals: Comparative Study of a Few Methods. European Conference on Machine Learning, pp. 366–371 (1993). DOI: 10.1007/3-540-56602-3_152
Kenji Kira, Larry A. Rendell. A Practical Approach to Feature Selection. International Conference on Machine Learning, pp. 249–256 (1992). DOI: 10.1016/B978-1-55860-247-2.50037-1
Josef Kittler, Pierre A. Devijver. Pattern recognition: a statistical approach. Prentice/Hall International (1982).