Safe Exploration in Continuous Action Spaces

Authors: Krishnamurthy Dvijotham, Yuval Tassa, Cosmin Paduraru, Todd Hester, Gal Dalal

DOI:

Keywords:

Abstract: … An appropriate Lyapunov function was identified for policy at … 5 depicts the drawbacks of reward shaping for ensuring safety. … the best reward shaping choice, and to no reward shaping at …

References (16)

Bill Goodwine. Engineering Differential Equations: Theory and Applications (2010).

Martin Enqvist. Linear models of nonlinear systems. Seminar presented at the Dept. of Automatic Control at Lund University, Sweden, May 11, 2006 (2005).

Eitan Altman. Constrained Markov Decision Processes (1999).

Diederik P. Kingma, Jimmy Ba. Adam: A Method for Stochastic Optimization. arXiv: Learning (2014).

Philip S. Thomas. Safe Reinforcement Learning (2015). doi:10.7275/7529913.0

Guy Shani, David Heckerman, Ronen I. Brafman, Craig Boutilier. An MDP-Based Recommender System. Journal of Machine Learning Research, vol. 6, pp. 1265-1295 (2005). doi:10.5555/1046920.1088715

Peter L. Bartlett, Jonathan Baxter. Infinite-horizon policy-gradient estimation. Journal of Artificial Intelligence Research, vol. 15, pp. 319-350 (2001). doi:10.1613/JAIR.806

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Human-level control through deep reinforcement learning. Nature, vol. 518, pp. 529-533 (2015). doi:10.1038/NATURE14236

Emanuel Todorov, Tom Erez, Yuval Tassa. MuJoCo: A physics engine for model-based control. Intelligent Robots and Systems, pp. 5026-5033 (2012). doi:10.1109/IROS.2012.6386109

Yuval Tassa, Daan Wierstra, Alexander Pritzel, Tom Erez, Jonathan J. Hunt, Nicolas Heess, David Silver, Timothy P. Lillicrap. Continuous control with deep reinforcement learning. arXiv: Learning (2015).
