Authors: Hussein Almuallim, Thomas G. Dietterich
DOI: 10.1016/0004-3702(94)90084-1
Keywords:
Abstract: In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias in Boolean domains. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires Θ((ln(1/δ) + [2^p + p ln n])/ε) training examples to guarantee PAC-learning a concept having p relevant features out of n available features. This bound is only logarithmic in the number of irrelevant features. For implementing the MIN-FEATURES bias, the paper presents five algorithms that identify a subset of features sufficient to construct a hypothesis consistent with the training examples. FOCUS-1 is a straightforward algorithm that returns a minimal and sufficient subset of features in quasi-polynomial time. FOCUS-2 does the same task as FOCUS-1 but is empirically shown to be substantially faster than FOCUS-1. Finally, Simple-Greedy, Mutual-Information-Greedy, and Weighted-Greedy are three greedy heuristics that trade optimality for computational efficiency. Experimental studies are presented that compare these exact and approximate algorithms to two well-known algorithms, ID3 and FRINGE, in learning situations where many irrelevant features are present. These experiments show that—contrary to expectations—the ID3 and FRINGE algorithms do not implement good approximations of MIN-FEATURES. The sample complexity and generalization performance of the FOCUS algorithms is substantially better than either ID3 or FRINGE on learning problems where the MIN-FEATURES bias is appropriate. The experiments also show that, among our three heuristics, Weighted-Greedy provides an excellent approximation to the FOCUS algorithms.
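To make the MIN-FEATURES bias concrete, the following is a minimal sketch (not the paper's actual implementation) of the exhaustive-search idea behind FOCUS-1: a feature subset is "sufficient" if no two training examples that agree on every feature in the subset carry conflicting labels, and the algorithm searches subsets in order of increasing size, so the first sufficient subset found is minimal. The function and variable names here are illustrative, not taken from the paper.

```python
from itertools import combinations

def sufficient(examples, subset):
    """True if no two examples agreeing on all features in `subset`
    have different labels (i.e., `subset` can express a consistent
    hypothesis for the training data)."""
    seen = {}
    for features, label in examples:
        key = tuple(features[i] for i in subset)
        # setdefault stores the first label seen for this projection;
        # a later conflicting label means the subset is insufficient.
        if seen.setdefault(key, label) != label:
            return False
    return True

def focus(examples, n_features):
    """Sketch of the FOCUS-1 idea: exhaustively try subsets of
    increasing size and return the first sufficient one, which is
    therefore of minimal size."""
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if sufficient(examples, subset):
                return subset
    return tuple(range(n_features))

# Example: the target is x0 XOR x1; feature 2 is irrelevant.
examples = [((0, 0, 0), 0), ((0, 1, 0), 1), ((1, 0, 1), 1), ((1, 1, 1), 0)]
print(focus(examples, 3))  # → (0, 1)
```

The exhaustive loop over all subsets is what makes FOCUS-1 quasi-polynomial in the worst case; the paper's greedy heuristics avoid this cost by growing the subset one feature at a time.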