Authors: Manuela Veloso, Michael Bowling
DOI:
Keywords: Algorithmic learning theory, Active learning (machine learning), Robot learning, Robot, Competitive learning, Artificial intelligence, Synchronous learning, Stability (learning theory), Instance-based learning, Multi-task learning, Computer science, Proactive learning, Learning classifier system, Reinforcement learning
Abstract: Multi-robot learning faces all of the challenges of robot learning together with all of the challenges of multiagent learning. There has been a great deal of recent research on multiagent reinforcement learning in stochastic games, which is the intuitive extension of MDPs to multiple agents. This work, although general, has only been applied to small games with at most hundreds of states. On the other hand, robot tasks have continuous, and often complex, state and action spaces. Robot learning tasks demand approximation and generalization techniques, which have only received extensive attention in single-agent learning. In this paper we introduce GraWoLF, a general-purpose, scalable, multiagent learning algorithm. It combines gradient-based policy learning techniques with the WoLF ("Win or Learn Fast") variable learning rate. We apply this algorithm to an adversarial multi-robot task with simultaneous learning, and show results of learning both in simulation and on real robots. These results demonstrate that GraWoLF can learn successful policies, overcoming many