Authors: Siddharth Sabharwal, Jayadeva, Sanjit S. Batra
DOI:
Keywords:
Abstract: Feature selection involves identifying the most relevant subset of input features, with a view to improving the generalization of predictive models by reducing overfitting. Directly searching for the optimal combination of attributes is NP-hard. Variable selection is of critical importance in many applications, such as micro-array data analysis, where selecting a small number of discriminative features is crucial to developing useful models of disease mechanisms, as well as to prioritizing targets for drug discovery. The recently proposed Minimal Complexity Machine (MCM) provides a way to learn a hyperplane classifier by minimizing an exact (\boldmath{$\Theta$}) bound on its VC dimension. It is known that a lower VC dimension contributes to good generalization. For a linear classifier in the input space, the VC dimension is upper bounded by the number of features; hence, a linear classifier with a small VC dimension is parsimonious in the set of features it employs. In this paper, we use the linear MCM to learn a classifier in which a large number of weights are zero; features with non-zero weights are the ones chosen. The selected features are then used to train a kernel SVM classifier. On a number of benchmark datasets, the features chosen by the linear MCM yield test accuracy comparable to or better than that obtained when methods such as ReliefF and FCBF are used for the task. The linear MCM typically chooses one-tenth the number of features selected by the other methods; on some very high dimensional datasets it chooses only about $0.6\%$ of the features; in comparison, ReliefF and FCBF choose 70 to 140 times more features, thus demonstrating that the MCM may provide a new, effective route to feature selection and to learning sparse representations.
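The pipeline the abstract describes (sparse linear classifier for selection, then a kernel SVM on the surviving features) can be sketched as follows. This is a minimal illustration, not the authors' implementation: no public MCM solver is assumed, so an L1-penalized linear SVM from scikit-learn stands in for the linear MCM, since both drive many weights to exactly zero. The dataset here is synthetic.

```python
# Sketch of the two-stage pipeline: (1) fit a sparse linear classifier,
# (2) keep features with non-zero weights, (3) train an RBF-kernel SVM
# on the reduced feature set.
# NOTE: LinearSVC with an L1 penalty is a stand-in for the linear MCM,
# which is not available in scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC, SVC

# Synthetic data: 200 features, only 10 of which are informative.
X, y = make_classification(n_samples=400, n_features=200,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 1: sparse linear classifier (proxy for the linear MCM).
sparse_lin = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000)
sparse_lin.fit(X_tr, y_tr)

# Stage 2: keep only features whose weight is non-zero.
selected = np.flatnonzero(np.abs(sparse_lin.coef_.ravel()) > 1e-6)
print(f"selected {selected.size} of {X.shape[1]} features")

# Stage 3: kernel SVM trained on the selected features alone.
svm = SVC(kernel="rbf").fit(X_tr[:, selected], y_tr)
print(f"test accuracy: {svm.score(X_te[:, selected], y_te):.3f}")
```

The regularization strength `C` controls the sparsity of the first stage; the MCM achieves its sparsity differently, by minimizing a bound on the VC dimension, but the selection step (keeping the non-zero-weight features) is the same.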