Authors: Anoop Korattikara, Max Welling, Sungjin Ahn
DOI:
Keywords:
Abstract: In this paper we address the following question: "Can we approximately sample from a Bayesian posterior distribution if we are only allowed to touch a small mini-batch of data-items for every sample we generate?" An algorithm based on the Langevin equation with stochastic gradients (SGLD) was previously proposed to solve this, but its mixing rate was slow. By leveraging the Central Limit Theorem, we extend SGLD so that at high mixing rates it samples from a normal approximation of the posterior, while at slow mixing rates it mimics the behavior of SGLD with a pre-conditioner matrix. As a bonus, the algorithm is reminiscent of Fisher scoring (with stochastic gradients) and as such is an efficient optimizer during burn-in.
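For context, a minimal sketch of the baseline SGLD update that the abstract refers to is given below; it shows one Langevin step driven by a mini-batch gradient estimate. The callables `grad_log_prior` and `grad_log_lik` are hypothetical stand-ins for the model's gradients, and this illustrates standard SGLD only, not the paper's Fisher-scoring extension or its pre-conditioner.

```python
import numpy as np

def sgld_step(theta, grad_log_prior, grad_log_lik, minibatch, N, step_size, rng):
    """One SGLD update: a Langevin step whose drift uses an unbiased
    stochastic gradient computed from a mini-batch of size n << N."""
    n = len(minibatch)
    # Mini-batch estimate of the full-data log-posterior gradient;
    # the (N / n) factor rescales the likelihood term to all N items.
    grad = grad_log_prior(theta) + (N / n) * sum(
        grad_log_lik(theta, x) for x in minibatch
    )
    # Injected Gaussian noise with variance equal to the step size turns
    # the stochastic-gradient ascent step into an approximate posterior sampler.
    noise = rng.normal(0.0, np.sqrt(step_size), size=theta.shape)
    return theta + 0.5 * step_size * grad + noise
```

As the step size is annealed toward zero, the injected noise dominates the gradient noise and the chain approximately samples from the posterior; the slow mixing this causes is the limitation the abstract says the proposed extension addresses.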