Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation

Authors: Bram Vanderborght, Lijun Chen, Marco M. Nicotra, Alessandro Roncone, Christoffer Heckman

Abstract: … solving a cooperative multi-robot object … system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems…

References (15)
Jerzy Filar, Koos Vrieze. Competitive Markov decision processes. Springer-Verlag New York, Inc. (1996). DOI: 10.1007/978-1-4612-4054-9
Michael P. Wellman, Junling Hu. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm. International Conference on Machine Learning, pp. 242-250 (1998).
Michael L. Littman. Markov games as a framework for multi-agent reinforcement learning. Machine Learning Proceedings 1994, pp. 157-163 (1994). DOI: 10.1016/B978-1-55860-335-6.50027-1
Ming Tan. Multi-agent reinforcement learning: independent vs. cooperative agents. International Conference on Machine Learning, pp. 487-494 (1997). DOI: 10.1016/B978-1-55860-307-3.50049-6
Laetitia Matignon, Guillaume J. Laurent, Nadine Le Fort-Piat. Review: independent reinforcement learners in cooperative Markov games: A survey regarding coordination problems. Knowledge Engineering Review, vol. 27, pp. 1-31 (2012). DOI: 10.1017/S0269888912000057
Michael P. Wellman, Junling Hu. Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research, vol. 4, pp. 1039-1069 (2003). DOI: 10.5555/945365.964288
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Human-level control through deep reinforcement learning. Nature, vol. 518, pp. 529-533 (2015). DOI: 10.1038/NATURE14236
Matthijs T. J. Spaan, Frans A. Oliehoek, Nikos Vlassis. Optimal and approximate Q-value functions for decentralized POMDPs. Journal of Artificial Intelligence Research, vol. 32, pp. 289-353 (2008). DOI: 10.1613/JAIR.2447
N. Koenig, A. Howard. Design and use paradigms for Gazebo, an open-source multi-robot simulator. Intelligent Robots and Systems, vol. 3, pp. 2149-2154 (2004). DOI: 10.1109/IROS.2004.1389727
Landon Kraemer, Bikramjit Banerjee. Multi-agent reinforcement learning as a rehearsal for decentralized planning. Neurocomputing, vol. 190, pp. 82-94 (2016). DOI: 10.1016/J.NEUCOM.2016.01.031