Learning Boolean concepts in the presence of many irrelevant features

Authors: Hussein Almuallim, Thomas G. Dietterich

DOI: 10.1016/0004-3702(94)90084-1

Abstract: In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias in Boolean domains. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires Θ((ln(1/δ) + [2^p + p·ln n]) / ε) training examples to guarantee PAC-learning a concept having p relevant features out of n available features. This bound is only logarithmic in the number of irrelevant features. For implementing the MIN-FEATURES bias, the paper presents five algorithms that identify a subset of the features sufficient to construct a hypothesis consistent with the training examples. FOCUS-1 is a straightforward algorithm that returns a minimal and sufficient subset of the features in quasi-polynomial time. FOCUS-2 does the same task as FOCUS-1, but is empirically shown to be substantially faster than FOCUS-1. Finally, Simple-Greedy, Mutual-Information-Greedy and Weighted-Greedy are three greedy heuristics that trade optimality for computational efficiency. Experimental studies are presented that compare these exact and approximate algorithms to two well-known algorithms, ID3 and FRINGE, in learning situations where many irrelevant features are present. These experiments show that, contrary to expectations, ID3 and FRINGE do not implement good approximations of MIN-FEATURES. The sample complexity and generalization performance of the FOCUS algorithms is substantially better than either ID3 or FRINGE on learning problems where the MIN-FEATURES bias is appropriate. These experiments also show that, among our three heuristics, Weighted-Greedy provides an excellent approximation of the FOCUS algorithms.
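To make the MIN-FEATURES bias concrete, here is a minimal Python sketch in the spirit of FOCUS-1 (a hypothetical reimplementation based only on the abstract, not the authors' code): search feature subsets in order of increasing size and return the first subset that is "sufficient", meaning no two training examples agree on every selected feature yet carry different labels. The function names and the toy data are illustrative assumptions.

```python
from itertools import combinations

def is_sufficient(examples, labels, subset):
    """True if projecting the examples onto `subset` never maps a
    positive and a negative example to the same point."""
    seen = {}
    for x, y in zip(examples, labels):
        key = tuple(x[i] for i in subset)
        if key in seen and seen[key] != y:
            return False  # conflict: subset cannot separate the sample
        seen[key] = y
    return True

def min_features(examples, labels):
    """Smallest sufficient feature subset, found by exhaustive search
    over subsets of size 0, 1, 2, ... (roughly O(n^p) for p relevant
    features; the paper's FOCUS-2 targets the same task faster)."""
    n = len(examples[0])
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_sufficient(examples, labels, subset):
                return subset
    return tuple(range(n))  # unreachable for consistent data

# Toy usage: label = x0 AND x1; features 2 and 3 are irrelevant noise.
X = [(0, 0, 0, 1), (0, 1, 1, 0), (1, 0, 0, 0), (1, 1, 1, 1), (1, 1, 0, 0)]
y = [0, 0, 0, 1, 1]
print(min_features(X, y))  # -> (0, 1)
```

Any hypothesis built over the returned subset can be made consistent with the training data, which is exactly the sufficiency condition the MIN-FEATURES bias minimizes over.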
