Marginalizing Corrupted Features

Authors: Laurens van der Maaten, Kilian Q. Weinberger, Minmin Chen, Stephen Tyree

Abstract: The goal of machine learning is to develop predictors that generalize well to test data. Ideally, this is achieved by training on an almost infinitely large data set that captures all variations in the data distribution. In practical settings, however, we do not have infinite data and our predictors may overfit. Overfitting may be combatted, for example, by adding a regularizer to the training objective or by defining a prior over the model parameters and performing Bayesian inference. In this paper, we propose a third, alternative approach to combat overfitting: we extend the training set with many artificial examples that are obtained by corrupting the original training data. We show that this approach is efficient for a range of corruption models. Our approach, called marginalized corrupted features (MCF), trains robust predictors by minimizing the expected value of the loss function under the corruption model. We show empirically on a variety of data sets that MCF classifiers can be trained efficiently, generalize substantially better to test data, and are also more robust to feature deletion at test time.
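The abstract's central idea, minimizing the expected loss under a corruption model rather than averaging over explicitly corrupted copies, admits a closed form in simple cases. The sketch below illustrates this for a linear predictor with squared loss under blankout (feature-deletion) noise, where each feature is zeroed independently with probability q; it uses the standard identities E[x̃] = (1−q)x and Var[x̃_d] = q(1−q)x_d². The data, weights, and function names are illustrative assumptions, not taken from the paper, and this is a minimal sketch rather than the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: one example x with target y and a fixed linear predictor w.
x = rng.normal(size=5)
y = 1.0
w = rng.normal(size=5)
q = 0.3  # blankout probability: each feature is deleted with prob q

def expected_sq_loss_closed_form(w, x, y, q):
    """E[(w . x_tilde - y)^2] under independent blankout corruption.

    Uses E[x_tilde] = (1 - q) x and Var[x_tilde_d] = q (1 - q) x_d^2,
    so the expectation splits into a bias term plus a variance term.
    """
    mean_pred = (1 - q) * (w @ x)
    var_pred = q * (1 - q) * np.sum((w * x) ** 2)
    return (mean_pred - y) ** 2 + var_pred

def expected_sq_loss_monte_carlo(w, x, y, q, n=200_000):
    """Estimate the same expectation by explicitly sampling corrupted copies."""
    mask = rng.random((n, x.size)) > q  # keep each feature with prob 1 - q
    preds = (mask * x) @ w
    return np.mean((preds - y) ** 2)

cf = expected_sq_loss_closed_form(w, x, y, q)
mc = expected_sq_loss_monte_carlo(w, x, y, q)
print(cf, mc)  # the two values agree up to Monte Carlo error
```

Minimizing the closed-form expectation over w is what makes marginalization attractive: it gives the effect of infinitely many corrupted training copies at the cost of a single analytic loss evaluation.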
