Dropout Training as Adaptive Regularization

Authors: Stefan Wager, Sida Wang, Percy S Liang

DOI:

Keywords: Mathematics, Regularization (mathematics), Overfitting, Generalized linear model, Fisher information, Diagonal, Document classification, Machine learning, Scaling, Inverse, Artificial intelligence

Abstract: … , dropout performs a form of adaptive regularization. Using this viewpoint, we show that the dropout … operates by repeatedly solving linear dropout-regularized problems. By casting …

References (23)
Sida Wang, Christopher Manning, Fast dropout training. International Conference on Machine Learning, pp. 118-126 (2013)
Hideki Isozaki, Jun Suzuki, Akinori Fujino, Semi-Supervised Structured Output Learning Based on a Hybrid Generative and Discriminative Approach. Empirical Methods in Natural Language Processing, pp. 791-800 (2007)
Ilya Sutskever, Geoffrey E. Hinton, Alex Krizhevsky, Ruslan R. Salakhutdinov, Nitish Srivastava, Improving neural networks by preventing co-adaptation of feature detectors. arXiv: Neural and Evolutionary Computing (2012)
Koby Crammer, Alex Kulesza, Mark Dredze, Adaptive Regularization of Weight Vectors. Neural Information Processing Systems, vol. 91, pp. 414-422 (2009), 10.1007/S10994-013-5327-X
Yaser S Abu-Mostafa, Learning from hints in neural networks. Journal of Complexity, vol. 6, pp. 192-198 (1990), 10.1016/0885-064X(90)90006-Y
K. Matsuoka, Noise injection into inputs in back-propagation learning. Systems, Man and Cybernetics, vol. 22, pp. 436-440 (1992), 10.1109/21.155944
Kamal Nigam, Andrew Kachites McCallum, Sebastian Thrun, Tom Mitchell, Text Classification from Labeled and Unlabeled Documents using EM. Machine Learning, vol. 39, pp. 103-134 (2000), 10.1023/A:1007692713085
Jerome Friedman, Trevor Hastie, Robert Tibshirani, Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, vol. 33, pp. 1-22 (2010), 10.18637/JSS.V033.I01
Yoshua Bengio, Xavier Glorot, Salah Rifai, Pascal Vincent, Adding noise to the input of a model trained with a regularized objective. arXiv: Artificial Intelligence (2011)
Bernhard Schölkopf, Christopher J. C. Burges, Improving the Accuracy and Speed of Support Vector Machines. Neural Information Processing Systems, vol. 9, pp. 375-381 (1996)