Simultaneous adversarial multi-robot learning

Authors: Manuela Veloso, Michael Bowling

Keywords: Algorithmic learning theory; Active learning (machine learning); Robot learning; Robot; Competitive learning; Artificial intelligence; Synchronous learning; Stability (learning theory); Instance-based learning; Multi-task learning; Computer science; Proactive learning; Learning classifier system; Reinforcement learning

Abstract: Multi-robot learning faces all of the challenges of robot learning together with all of the challenges of multiagent learning. There has been a great deal of recent research on multiagent reinforcement learning in stochastic games, which is the intuitive extension of MDPs to multiple agents. This recent work, although general, has only been applied to small games with at most hundreds of states. On the other hand, robot tasks have continuous, and often complex, state and action spaces. Robot learning tasks demand approximation and generalization techniques, which have only received extensive attention in single-agent learning. In this paper we introduce GraWoLF, a general-purpose, scalable, multiagent learning algorithm. It combines gradient-based policy learning techniques with the WoLF ("Win or Learn Fast") variable learning rate. We apply this algorithm to an adversarial multi-robot task with simultaneous learning. We show results of learning both in simulation and on the real robots. These results demonstrate that GraWoLF can learn successful policies, overcoming many
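
The abstract names two ingredients: a gradient-based policy update and the WoLF variable learning rate. The sketch below is only an illustration of how the two could fit together, not the paper's implementation. It assumes the usual reading of "Win or Learn Fast": take a small step when the current policy looks better than the running average policy, and a large step otherwise. The function names, the step sizes `rate_win` and `rate_lose`, and the value estimator `value_fn` are all hypothetical.

```python
def wolf_learning_rate(value_current, value_average, rate_win=0.01, rate_lose=0.04):
    """Pick the step size: small when 'winning', large when 'losing'.

    'Winning' is taken here (an assumption) to mean that the current policy's
    estimated value exceeds that of the running average policy.
    """
    return rate_win if value_current > value_average else rate_lose


def wolf_gradient_step(theta, theta_avg, grad, value_fn, count,
                       rate_win=0.01, rate_lose=0.04):
    """One gradient-ascent step on policy parameters with a WoLF step size.

    theta, theta_avg, grad are equal-length lists of floats; value_fn is a
    hypothetical estimator mapping a parameter list to a scalar value.
    Returns the new parameters, the updated running average, and the new count.
    """
    rate = wolf_learning_rate(value_fn(theta), value_fn(theta_avg),
                              rate_win, rate_lose)
    new_theta = [t + rate * g for t, g in zip(theta, grad)]
    new_count = count + 1
    # Incremental mean of the parameters visited so far.
    new_avg = [a + (t - a) / new_count for a, t in zip(theta_avg, new_theta)]
    return new_theta, new_avg, new_count
```

Keeping `rate_lose` larger than `rate_win` is what makes the learner adapt quickly when it is doing worse than its own average and cautiously when it is doing better.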

References (14)
Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press (1998).
Michael L. Littman. Markov games as a framework for multi-agent reinforcement learning. Machine Learning Proceedings 1994, pp. 157-163 (1994). DOI: 10.1016/B978-1-55860-335-6.50027-1
Peter L. Bartlett, Jonathan Baxter. Reinforcement Learning in POMDP's via Direct Gradient Ascent. International Conference on Machine Learning, pp. 41-48 (2000).
M. Kearns, Y. Mansour, S. Singh. Nash convergence of gradient dynamics in general-sum games. Uncertainty in Artificial Intelligence, pp. 541-548 (2000).
J. F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, vol. 36, pp. 48-49 (1950). DOI: 10.1073/PNAS.36.1.48
Craig Boutilier, Caroline Claus. The dynamics of reinforcement learning in cooperative multiagent systems. National Conference on Artificial Intelligence, pp. 746-752 (1998).
Manuela Veloso, Michael Bowling. Existence of multiagent equilibria with limited agents. Journal of Artificial Intelligence Research, vol. 22, pp. 353-384 (2004). DOI: 10.1613/JAIR.1332
Michael Bowling, Manuela Veloso. Multiagent learning using a variable learning rate. Artificial Intelligence, vol. 136, pp. 215-250 (2002). DOI: 10.1016/S0004-3702(02)00121-2
Yishay Mansour, Satinder P. Singh, Richard S. Sutton, David A. McAllester. Policy Gradient Methods for Reinforcement Learning with Function Approximation. Neural Information Processing Systems, vol. 12, pp. 1057-1063 (1999).