The Nonstochastic Multiarmed Bandit Problem

Authors: Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, Robert E. Schapire

DOI: 10.1137/S0097539701398375

Keywords:

Abstract: … Our player algorithms are based in part on an algorithm presented by Freund and Schapire [6, … In the setting analyzed by Freund and Schapire, the player scores on each pull the reward …

References (16)
T. Ishikida, P. Varaiya, Multi-armed bandit problem revisited. Journal of Optimization Theory and Applications, vol. 83, pp. 113-154 (1994), 10.1007/BF02191765
Sergiu Hart, Andreu Mas-Colell, A General Class of Adaptive Strategies. Journal of Economic Theory, vol. 98, pp. 26-54 (2001), 10.1006/JETH.2000.2746
Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, Manfred K. Warmuth, How to use expert advice. Journal of the ACM, vol. 44, pp. 427-485 (1997), 10.1145/258128.258179
Alfredo Banos, On Pseudo-Games. Annals of Mathematical Statistics, vol. 39, pp. 1932-1945 (1968), 10.1214/AOMS/1177698023
Yoav Freund, Robert E. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, vol. 55, pp. 119-139 (1997), 10.1006/JCSS.1997.1504
Herbert Robbins, Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, vol. 58, pp. 527-535 (1952), 10.1090/S0002-9904-1952-09620-8
T. L. Lai, Herbert Robbins, Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, vol. 6, pp. 4-22 (1985), 10.1016/0196-8858(85)90002-8
N. Littlestone, M. K. Warmuth, The weighted majority algorithm. Information & Computation, vol. 108, pp. 212-261 (1994), 10.1006/INCO.1994.1009
Thomas M. Cover, Joy A. Thomas, Elements of information theory (1991)
Yoav Freund, Robert E. Schapire, Adaptive game playing using multiplicative weights. Games and Economic Behavior, vol. 29, pp. 79-103 (1999), 10.1006/GAME.1999.0738