Measuring and Influencing Sequential Joint Agent Behaviours

Author: Peter Abraham Raffensperger

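Abstract: Algorithmically designed reward functions can influence groups of learning agents toward measurable desired sequential joint behaviours. Influencing learning agents toward desirable behaviours is non-trivial due to the difficulties of assigning credit for global success to the deserving agents and of inducing coordination. Quantifying joint behaviours lets us identify global success by ranking some behaviours as more desirable than others. We propose a real-valued metric for turn-taking, demonstrating how to measure one sequential joint behaviour. We describe how to identify the presence of turn-taking in simulation results and we calculate the quantity of turn-taking that could be observed between independent random agents. We demonstrate our turn-taking metric by reinterpreting previous work on emergent communication and by analysing a recorded human conversation. Given a metric, we can explore the space of reward functions and identify those that result in global success. We describe ‘medium access games’ as a model for human and machine communication. We present simulation results for an extensive range of reward functions for pairs of Q-learning agents. We use the Nash equilibria of medium access games to develop predictors for determining which reward functions result in turn-taking. Having demonstrated the predictive power of Nash equilibria for turn-taking in medium access games, we focus on the synthesis of reward functions for stochastic games that result in arbitrary turn-taking Nash equilibria. Our method constructs a reward function such that a particular joint behaviour is the unique Nash equilibrium of a stochastic game, provided that such a reward function exists. This builds on techniques for designing rewards for Markov decision processes and for normal form games. We explain our reward design methods in detail and formally prove that they are correct.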
