Pure exploration in multi-armed bandits problems

Authors: Sébastien Bubeck, Rémi Munos, Gilles Stoltz

Abstract: … We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of strategies that perform an online exploration of the arms. The …
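
As a rough illustration of the setting described in the abstract (not taken from the paper's text), the sketch below explores Bernoulli arms in round-robin order and then recommends the arm with the highest empirical mean. The function name `uniform_exploration`, the Bernoulli reward model, the round-robin allocation, and the reported gap between the best mean and the recommended arm's mean are all assumptions made for this example.

```python
import random


def uniform_exploration(means, n_rounds, seed=0):
    """Explore arms round-robin, then recommend the empirically best arm.

    `means` holds the (hypothetical) true Bernoulli means of the arms; they
    are used only to simulate rewards and to measure the final gap.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    totals = [0.0] * k

    # Exploration phase: each round pulls one arm and observes a 0/1 reward.
    for t in range(n_rounds):
        arm = t % k
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward

    # Recommendation: the arm with the highest empirical mean so far.
    empirical = [totals[i] / pulls[i] if pulls[i] else 0.0 for i in range(k)]
    recommended = max(range(k), key=lambda i: empirical[i])

    # Gap between the best true mean and the recommended arm's true mean.
    gap = max(means) - means[recommended]
    return recommended, gap


if __name__ == "__main__":
    arm, gap = uniform_exploration([0.1, 0.9], n_rounds=2000)
    print(arm, gap)
```

With well-separated means and enough rounds, the recommended arm is almost always the best one, so the gap is zero; with close means or few rounds, it can be positive.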

