Learning Boolean concepts in the presence of many irrelevant features

Authors: Hussein Almuallim, Thomas G. Dietterich

DOI: 10.1016/0004-3702(94)90084-1

Abstract: In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias in Boolean domains. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires Θ((ln(1/δ) + [2^p + p·ln n]) / ε) training examples to guarantee PAC-learning a concept having p relevant features out of n available features. This bound is only logarithmic in the number of irrelevant features. For implementing the MIN-FEATURES bias, the paper presents five algorithms that identify a subset of the features sufficient to construct a hypothesis consistent with the training examples. FOCUS-1 is a straightforward algorithm that returns a minimal and sufficient subset of the features in quasi-polynomial time. FOCUS-2 does the same task as FOCUS-1, but is empirically shown to be substantially faster than FOCUS-1. Finally, Simple-Greedy, Mutual-Information-Greedy and Weighted-Greedy are three greedy heuristics that trade optimality for computational efficiency. Experimental studies are presented that compare these exact and approximate algorithms to two well-known algorithms, ID3 and FRINGE, in learning situations where many irrelevant features are present. These experiments show that, contrary to expectations, ID3 and FRINGE do not implement good approximations of MIN-FEATURES. The sample complexity and generalization performance of the FOCUS algorithms is substantially better than either ID3 or FRINGE on learning problems where the MIN-FEATURES bias is appropriate. These experiments also show that, among our three heuristics, Weighted-Greedy provides an excellent approximation of the FOCUS algorithms.
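To make the MIN-FEATURES bias concrete, here is a minimal Python sketch in the spirit of FOCUS-1 (a hypothetical reimplementation based only on the abstract, not the authors' code): search feature subsets in order of increasing size and return the first subset that is "sufficient", meaning no two training examples agree on every selected feature yet carry different labels. The function names and the toy data are illustrative assumptions.

```python
from itertools import combinations

def is_sufficient(examples, labels, subset):
    """True if projecting the examples onto `subset` never maps a
    positive and a negative example to the same point."""
    seen = {}
    for x, y in zip(examples, labels):
        key = tuple(x[i] for i in subset)
        if key in seen and seen[key] != y:
            return False  # conflict: subset cannot separate the sample
        seen[key] = y
    return True

def min_features(examples, labels):
    """Smallest sufficient feature subset, found by exhaustive search
    over subsets of size 0, 1, 2, ... (roughly O(n^p) for p relevant
    features; the paper's FOCUS-2 targets the same task faster)."""
    n = len(examples[0])
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_sufficient(examples, labels, subset):
                return subset
    return tuple(range(n))  # unreachable for consistent data

# Toy usage: label = x0 AND x1; features 2 and 3 are irrelevant noise.
X = [(0, 0, 0, 1), (0, 1, 1, 0), (1, 0, 0, 0), (1, 1, 1, 1), (1, 1, 0, 0)]
y = [0, 0, 0, 1, 1]
print(min_features(X, y))  # -> (0, 1)
```

Any hypothesis built over the returned subset can be made consistent with the training data, which is exactly the sufficiency condition the MIN-FEATURES bias minimizes over.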
