Markov Chain Monte Carlo and Variational Inference: Bridging the Gap

Authors: Tim Salimans, Max Welling, Diederik Kingma

DOI:

Keywords: Applied mathematics, Markov chain Monte Carlo, Monte Carlo method, Computation, Maximization, Variational message passing, Bayesian inference, Mathematical optimization, Mathematics, Random variable, Inference

Abstract: Recent advances in stochastic gradient variational inference have made it possible to perform variational Bayesian inference with posterior approximations containing auxiliary random variables. This enables us to explore a new synthesis of variational inference and Monte Carlo methods where we incorporate one or more steps of MCMC into our variational approximation. By doing so we obtain a rich class of inference algorithms bridging the gap between variational methods and MCMC, and offering the best of both worlds: fast posterior approximation through the maximization of an explicit objective, with the option of trading off additional computation for additional accuracy. We describe the theoretical foundations that make this possible and show some promising first results.
