BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees

Authors: Yongjoo Park, Jingyi Qing, Xiaoyang Shen, Barzan Mozafari

DOI: 10.1145/3299869.3300077

Keywords: Poisson regression, Probabilistic logic, Speedup, Sampling (statistics), Entropy (information theory), Linear regression, Logistic regression, Computer science, Algorithm, Generalized linear model, Maximum likelihood, Hyperparameter

Abstract: The rising volume of datasets has made training machine learning (ML) models a major computational cost in the enterprise. Given the iterative nature of model and parameter tuning, many analysts use a small sample of their entire data during the initial stage of their analysis to make quick decisions (e.g., what features or hyperparameters to use), and use the entire dataset only in later stages (i.e., when they have converged to a specific model). This sampling, however, is performed in an ad-hoc fashion. Most practitioners cannot precisely capture the effect of sampling on the quality of their model, and eventually on their decision-making process during the tuning phase. Moreover, without systematic support for sampling operators, many optimizations and reuse opportunities are lost. In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML training. BlinkML allows users to make error-computation tradeoffs: instead of training a model on their full data (i.e., a full model), they can quickly train an approximate model, with quality guarantees, using a sample. BlinkML ensures that, with high probability, the approximate model makes the same predictions as the full model. BlinkML currently supports any model that relies on maximum likelihood estimation (MLE), which includes Generalized Linear Models (e.g., linear regression, logistic regression, max entropy classifier, Poisson regression) as well as PPCA (Probabilistic Principal Component Analysis). Our experiments show that BlinkML can speed up the training of large-scale ML tasks by 6.26x-629x while guaranteeing the same predictions, with 95% probability, as the full model.
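The core idea in the abstract, that an MLE model trained on a small uniform sample often makes (nearly) the same predictions as the model trained on the full data, can be illustrated with a minimal sketch. This is not BlinkML's algorithm or API (BlinkML additionally estimates the sampling error and picks a sample size that meets the requested agreement probability); the sketch below only trains a logistic regression by MLE on a 5% sample and measures its prediction agreement with the full-data model, on synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, steps=500, lr=0.1):
    """Fit logistic regression by gradient ascent on the log-likelihood (MLE)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))       # predicted P(y=1)
        w += lr * X.T @ (y - p) / len(y)        # average log-likelihood gradient
    return w

# Synthetic "full" dataset (hypothetical stand-in for an enterprise-scale table)
n, d = 20000, 5
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (1.0 / (1.0 + np.exp(-X @ true_w)) > rng.uniform(size=n)).astype(float)

# Full model vs. approximate model trained on a 5% uniform sample
w_full = train_logreg(X, y)
idx = rng.choice(n, size=n // 20, replace=False)
w_sample = train_logreg(X[idx], y[idx])

# Fraction of points on which the two models predict the same class
agree = np.mean((X @ w_full > 0) == (X @ w_sample > 0))
print(f"prediction agreement: {agree:.3f}")
```

In practice the agreement is high even at small sample sizes; BlinkML's contribution is to certify such agreement a priori (e.g., at the 95% level quoted above) rather than measuring it after the fact, which requires the full model.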
