Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning

Authors: Scott Niekum, Supawit Chockchowwat, Caleb Chuck

Abstract: Deep reinforcement learning (DRL) is capable of learning high-performing policies on a variety of complex high-dimensional tasks, ranging from video games to robotic …


