Nonstrict Hierarchical Reinforcement Learning for Interactive Systems and Robots

Authors: Heriberto Cuayáhuitl, Ivana Kruijff-Korbayová, Nina Dethlefs

DOI: 10.1145/2659003

Abstract: Conversational systems and robots that use reinforcement learning for policy optimization in large domains often face the problem of limited scalability. This problem has been addressed either by using function approximation techniques that estimate the approximate true value function of a policy or by using a hierarchical decomposition of a learning task into subtasks. We present a novel approach for dialogue policy optimization that combines the benefits of both hierarchical control and function approximation and that allows flexible transitions between dialogue subtasks to give human users more control over the dialogue. To this end, each reinforcement learning agent in the hierarchy is extended with a subtask transition function and a dynamic state space to allow flexible switching between subdialogues. In addition, the subtask policies are represented with linear function approximation in order to generalize the decision making to situations unseen in training. Our proposed approach is evaluated in an interactive conversational robot that learns to play quiz games. Experimental results, using simulation and real users, provide evidence that our proposed approach can lead to more flexible (natural) interactions than strict hierarchical control and that it is preferred by human users.
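
As a rough illustration of the two ideas the abstract combines, the Python sketch below represents a subtask policy with linear function approximation, Q(s, a) ≈ θ_a · φ(s), and lets an agent leave its current subtask when the user requests another one. Every name here (the `LinearSubtaskPolicy` class, the feature vector, the `next_subtask` rule, the subtask labels) is a hypothetical stand-in chosen for illustration. It is not the authors' implementation, and the update shown is plain Q-learning, which the abstract does not specify.

```python
class LinearSubtaskPolicy:
    """Hypothetical subtask policy: Q(s, a) is approximated as theta[a] . phi(s)."""

    def __init__(self, n_features, actions, alpha=0.1, gamma=0.99):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # One weight vector per action, initialised to zero.
        self.theta = {a: [0.0] * n_features for a in self.actions}

    def q(self, phi, action):
        """Linear value estimate for a feature vector and an action."""
        return sum(w * x for w, x in zip(self.theta[action], phi))

    def greedy(self, phi):
        """Action with the highest estimate (first action wins ties)."""
        return max(self.actions, key=lambda a: self.q(phi, a))

    def update(self, phi, action, reward, phi_next, done):
        """One Q-learning step on the weights of the chosen action."""
        target = reward
        if not done:
            target += self.gamma * max(self.q(phi_next, a) for a in self.actions)
        td_error = target - self.q(phi, action)
        self.theta[action] = [
            w + self.alpha * td_error * x
            for w, x in zip(self.theta[action], phi)
        ]
        return td_error


def next_subtask(current, requested, allowed):
    """Hypothetical nonstrict transition rule.

    Switch to the requested subtask only if (current, requested) is an
    allowed transition. An empty `allowed` set behaves like strict control.
    """
    if requested is not None and (current, requested) in allowed:
        return requested
    return current


policy = LinearSubtaskPolicy(n_features=3, actions=["ask", "confirm"])
phi = [1.0, 0.0, 1.0]
td = policy.update(phi, "ask", reward=1.0, phi_next=phi, done=True)
```

Because unseen states still map to feature vectors, the learned weights produce an estimate for them, which is the generalization the abstract refers to. The `allowed` set is where a designer would decide how much freedom users get to jump between subdialogues.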
