Cooperative Control of Mobile Robots with Stackelberg Learning

Authors: Lijun Chen, Alessandro Roncone, Christoffer Heckman, Guohui Ding, Joewie J. Koh

DOI: 10.1109/IROS45743.2020.9341376

Abstract: Multi-robot cooperation requires agents to make decisions that are consistent with the shared goal without disregarding action-specific preferences that might arise from asymmetry in capabilities and individual objectives. To accomplish this goal, we propose a method named SLiCC: Stackelberg Learning in Cooperative Control. SLiCC models the problem as a partially observable stochastic game composed of bimatrix games, and uses deep reinforcement learning to obtain the payoff matrices associated with these games. Appropriate cooperative actions are then selected using the derived equilibria. Using a bi-robot object transportation problem, we validate the performance of SLiCC against centralized multi-agent Q-learning and demonstrate that it achieves better combined utility.
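
To make the equilibrium-selection step concrete, the following is a minimal sketch of how a pure-strategy Stackelberg equilibrium can be read off a single bimatrix game. It is not the authors' implementation: the function name `stackelberg_pure`, the hand-written payoff matrices, and the tie-breaking rule (follower ties resolved in the leader's favor) are all assumptions made for illustration. In SLiCC the payoff matrices would come from deep reinforcement learning rather than being written by hand.

```python
def stackelberg_pure(leader_payoff, follower_payoff):
    """Return (leader_action, follower_action) for a pure-strategy
    Stackelberg equilibrium of a bimatrix game.

    Rows index leader actions, columns index follower actions.
    Assumption: when the follower is indifferent, it picks the
    response that is best for the leader.
    """
    best = None
    for i, (a_row, b_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # Follower observes leader action i and best-responds to it.
        top = max(b_row)
        responses = [j for j, b in enumerate(b_row) if b == top]
        # Break follower ties in the leader's favor.
        j = max(responses, key=lambda c: a_row[c])
        # Leader keeps the commitment that yields its highest payoff.
        if best is None or a_row[j] > leader_payoff[best[0]][best[1]]:
            best = (i, j)
    return best


# Hypothetical 2x2 payoffs (illustrative numbers only).
A = [[2, 4], [1, 3]]  # leader
B = [[1, 0], [0, 1]]  # follower
print(stackelberg_pure(A, B))  # (1, 1)
```

In this toy game, row 0 strictly dominates row 1 for the leader, so simultaneous play would end at (0, 0) with a leader payoff of 2. By committing first to row 1, the leader induces the follower to choose column 1 and earns 3, which is the kind of leader-follower asymmetry a Stackelberg formulation captures.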
