The Tradeoffs of Large Scale Learning

作者: Léon Bottou , Olivier Bousquet

DOI:

关键词: Generalization errorTheoretical computer scienceComputational learning theoryAlgorithmic learning theoryActive learning (machine learning)Mathematical optimizationInstance-based learningProactive learningEmpirical risk minimizationComputer scienceStability (learning theory)Sample exclusion dimensionOnline machine learning

摘要: This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms. The analysis shows distinct tradeoffs for case small-scale and large-scale problems. Small-scale problems are subject to usual approximation-estimation tradeoff. Large-scale qualitatively different tradeoff involving computational complexity underlying algorithms in non-trivial ways.

参考文章(28)
O Bousquet, Concentration Inequalities and Empirical Processes Theory Applied to the Analysis of Learning Algorithms École Polytechnique: Department of Applied Mathematics Paris, France. ,(2002)
Vladimir Naumovich Vapnik, Estimation of Dependences Based on Empirical Data ,(2010)
Ingo Steinwart, Clint Scovel, Fast rates for support vector machines conference on learning theory. pp. 279- 294 ,(2005) , 10.1007/11503415_19
Peter L Bartlett, Michael I Jordan, Jon D McAuliffe, None, Convexity, Classification, and Risk Bounds Journal of the American Statistical Association. ,vol. 101, pp. 138- 156 ,(2006) , 10.1198/016214505000000907
Vladimir Vapnik, Esther Levin, Yann Le Cun, Measuring the VC-dimension of a learning machine Neural Computation. ,vol. 6, pp. 851- 876 ,(1994) , 10.1162/NECO.1994.6.5.851
Stephen Judd, On the complexity of loading shallow neural networks Journal of Complexity. ,vol. 4, pp. 177- 192 ,(1988) , 10.1016/0885-064X(88)90019-2
Stéphane Boucheron, Olivier Bousquet, Gábor Lugosi, Theory of classification : a survey of some recent advances Esaim: Probability and Statistics. ,vol. 9, pp. 323- 375 ,(2005) , 10.1051/PS:2005018
L. G. Valiant, A theory of the learnable symposium on the theory of computing. ,vol. 27, pp. 1134- 1142 ,(1984) , 10.1145/800057.808710