Jointly Pre-training with Supervised, Autoencoder, and Value Losses for Deep Reinforcement Learning.

Authors: Matthew E. Taylor, Gabriel Victor de la Cruz, Yunshu Du

Abstract: Deep Reinforcement Learning (DRL) algorithms are known to be data inefficient. One reason is that a DRL agent learns both the feature and the policy tabula rasa. Integrating …

