Rational and convergent learning in stochastic games

Authors: Michael Bowling, Manuela Veloso

Abstract: This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties as desirable for a learning agent when in the presence of other learning agents, namely rationality and convergence. We examine existing reinforcement learning algorithms according to these two properties and notice that they fail to simultaneously meet both criteria. We then contribute a new learning algorithm, WoLF policy hillclimbing, that is based on a simple principle: “learn quickly while losing, slowly while winning.” The algorithm is proven to be rational, and we present empirical results for a number of stochastic games showing that the algorithm converges.
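
The "learn quickly while losing, slowly while winning" principle can be illustrated with a small sketch. This is not the authors' implementation: the function names, the two step sizes, and the test for "winning" (the current policy scoring higher than a running-average policy under the current action-value estimates) are assumptions made here for illustration, for a single state with at least two actions.

```python
def expected_value(policy, q_values):
    """Expected payoff of a mixed policy under the current action-value estimates."""
    return sum(p * q for p, q in zip(policy, q_values))


def wolf_step_size(policy, avg_policy, q_values, delta_win, delta_lose):
    """Return the small step when 'winning' and the large step when 'losing'.

    Assumption: 'winning' means the current policy has a higher expected
    value than the running-average policy under the same value estimates.
    """
    winning = expected_value(policy, q_values) > expected_value(avg_policy, q_values)
    return delta_win if winning else delta_lose


def hill_climb(policy, q_values, delta):
    """Shift probability mass toward the greedy action by a total of at most delta.

    Each non-greedy action gives up at most delta / (n - 1), and never more
    than it currently holds, so the result is still a probability distribution.
    Requires at least two actions.
    """
    n = len(policy)
    best = max(range(n), key=lambda a: q_values[a])
    new_policy = list(policy)
    moved = 0.0
    for a in range(n):
        if a != best:
            take = min(new_policy[a], delta / (n - 1))
            new_policy[a] -= take
            moved += take
    new_policy[best] += moved
    return new_policy


# Hypothetical two-action example: the current policy is doing worse than
# its average, so the larger (losing) step is selected.
q = [1.0, 0.0]
policy = [0.5, 0.5]
avg_policy = [0.8, 0.2]
delta = wolf_step_size(policy, avg_policy, q, delta_win=0.1, delta_lose=0.4)
updated = hill_climb(policy, q, delta)
```

The only difference between the two cases is the step size: the direction of the update is always toward the action with the highest estimated value.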

References (18)
Junling Hu, Michael P. Wellman. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm. International Conference on Machine Learning, pp. 242-250 (1998).
Richard S. Sutton, Andrew G. Barto. Introduction to Reinforcement Learning. MIT Press (1998).
Michael H. Bowling, Manuela M. Veloso. Convergence of Gradient Dynamics with a Variable Learning Rate. International Conference on Machine Learning, pp. 27-34 (2001).
Michael L. Littman. Markov games as a framework for multi-agent reinforcement learning. Machine Learning Proceedings 1994, pp. 157-163 (1994). DOI: 10.1016/B978-1-55860-335-6.50027-1
S. Singh, M. Kearns, Y. Mansour. Nash convergence of gradient dynamics in general-sum games. Uncertainty in Artificial Intelligence, pp. 541-548 (2000).
A. M. Fink. Equilibrium in a stochastic $n$-person game. Journal of Science of the Hiroshima University, Series A-I (Mathematics), vol. 28, pp. 89-93 (1964). DOI: 10.32917/HMJ/1206139508
Sandip Sen. Evolution and learning in multiagent systems. International Journal of Human-Computer Studies / International Journal of Man-Machine Studies, vol. 48, pp. 1-7 (1998). DOI: 10.1006/IJHC.1997.0157
Avrim Blum, Carl Burch. On-line learning and the metrical task system problem. Proceedings of the Tenth Annual Conference on Computational Learning Theory (COLT '97), pp. 45-53 (1997). DOI: 10.1145/267460.267475
J. F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, vol. 36, pp. 48-49 (1950). DOI: 10.1073/PNAS.36.1.48
Caroline Claus, Craig Boutilier. The dynamics of reinforcement learning in cooperative multiagent systems. National Conference on Artificial Intelligence, pp. 746-752 (1998).