Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems

Authors: D.E. Koulouriotis, A. Xanthopoulos

DOI: 10.1016/J.AMC.2007.07.043

Keywords: Genetic algorithm, Softmax function, Thompson sampling, Artificial intelligence, Action selection, Mathematics, Reinforcement learning, Multi-armed bandit, Probability matching, Evolutionary algorithm

Abstract: … itself in the face of evolutionary algorithms. We present an evolutionary algorithm that was implemented to solve the non-stationary bandit problem along with ad hoc solution algorithms, …

References (18)
Dmitriy V. Chulkov, Mayur S. Desai. Information technology project failures: Applying the bandit problem to evaluate managerial decision making. Information Management & Computer Security, vol. 13, pp. 135-143 (2005). DOI: 10.1108/09685220510589316
Benoît Leloup, Laurent Deveaux. Dynamic Pricing on the Internet: Theory and Simulations. Electronic Commerce Research, vol. 1, pp. 265-276 (2001). DOI: 10.1023/A:1011546021787
P. Varaiya, J. Walrand, C. Buyukkoc. Extensions of the multiarmed bandit problem: The discounted case. IEEE Transactions on Automatic Control, vol. 30, pp. 426-439 (1985). DOI: 10.1109/TAC.1985.1103989
Rina Azoulay-Schwartz, Sarit Kraus, Jonathan Wilkenfeld. Exploitation vs. exploration: choosing a supplier in an environment of incomplete information. Decision Support Systems, vol. 38, pp. 1-18 (2004). DOI: 10.1016/S0167-9236(03)00061-7
Jeffrey S. Banks, Rangarajan K. Sundaram. Switching Costs and the Gittins Index. Econometrica, vol. 62, pp. 687-694 (1994). DOI: 10.2307/2951664
Irene Valsecchi. Job assignment and bandit problems. International Journal of Manpower, vol. 24, pp. 844-866 (2003). DOI: 10.1108/01437720310502168
Brian P. McCall, John J. McCall. Systematic search, belated information, and the Gittins' index. Economics Letters, vol. 8, pp. 327-333 (1981). DOI: 10.1016/0165-1765(81)90021-5
Donald A. Berry, Bert Fristedt. Bandit problems: Sequential Allocation of Experiments (1984).
David B. Fogel, Hans-Georg Beyer. Do evolutionary processes minimize expected losses? Journal of Theoretical Biology, vol. 207, pp. 117-123 (2000). DOI: 10.1006/JTBI.2000.2166