Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning

Authors: Scott Niekum, Supawit Chockchowwat, Caleb Chuck

Abstract: Deep reinforcement learning (DRL) is capable of learning high-performing policies on a variety of complex high-dimensional tasks, ranging from video games to robotic …

References (46)
Joel Veness, Michael Bowling, Marc G. Bellemare. Investigating contingency awareness using Atari 2600 games. National Conference on Artificial Intelligence, pp. 864-871 (2012).
Scott Niekum, Sarah Osentoski, Christopher G. Atkeson, Andrew G. Barto. Online Bayesian changepoint detection for articulated motion models. International Conference on Robotics and Automation, pp. 1468-1475 (2015). DOI: 10.1109/ICRA.2015.7139383
Michael D. Escobar. Estimating Normal Means with a Dirichlet Process Prior. Journal of the American Statistical Association, vol. 89, pp. 268-277 (1994). DOI: 10.1080/01621459.1994.10476468
Carlos Diuk, Andre Cohen, Michael L. Littman. An object-oriented representation for efficient reinforcement learning. Proceedings of the 25th International Conference on Machine Learning (ICML '08), pp. 240-247 (2008). DOI: 10.1145/1390156.1390187
N. Hansen, A. Ostermeier. Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation. IEEE International Conference on Evolutionary Computation, pp. 312-317 (1996). DOI: 10.1109/ICEC.1996.542381
George Konidaris, Andrew G. Barto. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining. Neural Information Processing Systems, vol. 22, pp. 1015-1023 (2009).
Richard S. Sutton, Doina Precup, Satinder Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, vol. 112, pp. 181-211 (1999). DOI: 10.1016/S0004-3702(99)00052-1
Gerhard Neumann, Oliver Kroemer, Jan Peters, Christian Daniel. Hierarchical Relative Entropy Policy Search (2014).
Todd Hester, Peter Stone. Real time targeted exploration in large domains. International Conference on Development and Learning, pp. 191-196 (2010). DOI: 10.1109/DEVLRN.2010.5578845
Özgür Şimşek, Andrew G. Barto. Skill Characterization Based on Betweenness. Neural Information Processing Systems, vol. 21, pp. 1497-1504 (2008).