Scalable Thompson Sampling via Optimal Transport

Authors: Ruiyi Zhang, Zheng Wen, Changyou Chen, Chen Fang, Tong Yu

Abstract: Thompson sampling (TS) is a class of algorithms for sequential decision-making that requires maintaining a posterior distribution over a model. However, calculating exact …

References (23)
Shipra Agrawal, Navin Goyal, Thompson Sampling for Contextual Bandits with Linear Payoffs. International Conference on Machine Learning, pp. 127-135 (2013)
Ian Osband, Alexander Pritzel, Benjamin Van Roy, Charles Blundell, Deep Exploration via Bootstrapped DQN. arXiv: Learning (2016)
Dilin Wang, Qiang Liu, Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm. arXiv: Machine Learning (2016)
Emma Brunskill, Animashree Anandkumar, Kamyar Azizzadenesheli, Efficient Exploration through Bayesian Deep Q-Networks. arXiv: Artificial Intelligence (2018)
Lawrence Carin, Changyou Chen, Chunyuan Li, Ruiyi Zhang, Policy Optimization as Wasserstein Gradient Flows. International Conference on Machine Learning, pp. 5737-5746 (2018)
Yasin Abbasi-Yadkori, Sharan Vaswani, Zheng Wen, Mark Schmidt, Anup Rao, Branislav Kveton, New Insights into Bootstrapping for Bandits. arXiv: Learning (2018)
Lawrence Carin, Changyou Chen, Ruiyi Zhang, Jianyi Zhang, Stochastic Particle-Optimization Sampling and the Non-Asymptotic Convergence Theory. arXiv: Machine Learning (2018)
Daniel Russo, Benjamin Van Roy, An Information-Theoretic Analysis of Thompson Sampling. Journal of Machine Learning Research, vol. 17, pp. 2442-2471 (2016), doi:10.5555/2946645.3007021
Ian Osband, Abbas Kazerouni, Daniel J. Russo, Benjamin Van Roy, Zheng Wen, A Tutorial on Thompson Sampling (2018)
Lawrence Carin, Changyou Chen, Chunyuan Li, Ruiyi Zhang, Learning Structural Weight Uncertainty for Sequential Decision-Making. International Conference on Artificial Intelligence and Statistics, pp. 1137-1146 (2018)