Evaluating the Robustness of Natural Language Reward Shaping Models to Spatial Relations

Author: Antony Yun

References (7)
Andrew Y. Ng, Daishi Harada, Stuart J. Russell. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. International Conference on Machine Learning, pp. 278-287 (1999).
Sepp Hochreiter, Jürgen Schmidhuber. Long Short-Term Memory. Neural Computation, vol. 9, pp. 1735-1780 (1997). DOI: 10.1162/NECO.1997.9.8.1735
R.S. Sutton, A.G. Barto. Reinforcement Learning: An Introduction (1998).
Prasoon Goyal, Scott Niekum, Raymond J. Mooney. Using Natural Language for Reward Shaping in Reinforcement Learning. International Joint Conference on Artificial Intelligence, pp. 2385-2391 (2019). DOI: 10.24963/IJCAI.2019/331
Prasoon Goyal, Scott Niekum, Raymond J. Mooney. PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards. arXiv: Learning (2020).
Vincent François-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau. An Introduction to Deep Reinforcement Learning (2019).
Sham Machandranath Kakade. On the Sample Complexity of Reinforcement Learning. Doctoral thesis, UCL (University College London) (2003).