Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems

DOI: 10.1016/J.AMC.2007.07.043

Keywords: Genetic algorithm, Softmax function, Thompson sampling, Artificial intelligence, Action selection, Mathematics, Reinforcement learning, Multi-armed bandit, Probability matching, Evolutionary algorithm

Abstract: … itself in the face of evolutionary algorithms. We present an evolutionary algorithm that was implemented to solve the non-stationary bandit problem along with ad hoc solution algorithms, …


