Learning non-cooperative behaviour for dialogue agents

Authors: Oliver Lemon, Ioannis Efstathiou

DOI: 10.3233/978-1-61499-419-0-999

Keywords: Bluff, Order (exchange), Perfect information, Reinforcement learning, Information hiding, Adversary, Deception, Variety (cybernetics), Artificial intelligence, Computer science

Abstract: Non-cooperative dialogue behaviour for artificial agents (e.g. deception and information hiding) has been identified as important in a variety of application areas, including education and healthcare, but it has not yet been addressed using modern statistical approaches to dialogue agents. Deception has also been argued to be a requirement for high-order intentionality in AI. We develop and evaluate a statistical dialogue agent using Reinforcement Learning which learns to perform non-cooperative dialogue moves in order to complete its own objectives in a stochastic trading game with imperfect information. We show that, when given the ability to perform both cooperative and non-cooperative moves, such an agent can learn to bluff and to lie so as to win more games. For example, the learned agent wins 10.5% more games than a strong rule-based adversary, compared to an optimised agent that cannot perform non-cooperative moves. This work is the first to show how agents can learn to use dialogue in a non-cooperative way to meet their own goals.
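To illustrate the core technique named in the abstract, the sketch below shows tabular Q-learning in a minimal toy setting. It is an assumption-laden illustration, not the paper's actual trading game: the action names, the single-state setup, and the stochastic win probabilities (under which bluffing happens to pay off) are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy game (NOT the paper's trading game): the agent picks one
# dialogue move per episode, and a stochastic adversary concedes more often
# when the agent bluffs. Payoffs are invented for illustration only.
ACTIONS = ["honest_offer", "bluff"]
WIN_PROB = {"honest_offer": 0.5, "bluff": 0.7}  # assumed, for the sketch

def play(action, rng):
    """Return +1 for a win, -1 for a loss."""
    return 1.0 if rng.random() < WIN_PROB[action] else -1.0

def q_learn(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """One-step (bandit-style) Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)  # single-state game: Q maps action -> value
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)               # explore
        else:
            a = max(ACTIONS, key=q.__getitem__)   # exploit current estimate
        r = play(a, rng)
        q[a] += alpha * (r - q[a])                # incremental Q-update
    return q

q = q_learn()
best = max(ACTIONS, key=q.__getitem__)
print(best)
```

In this toy setup the agent's Q-value for "bluff" converges near the expected payoff 0.4, above the 0.0 of "honest_offer", so the learned policy prefers the non-cooperative move, mirroring (in miniature) the paper's finding that an RL agent given non-cooperative moves learns to use them when they win more games.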
