Reinforcement learning from human reward: Discounting in episodic tasks

DOI: 10.1109/ROMAN.2012.6343862

Keywords: Discounting, Reinforcement learning, Credence, Trainer, Behavioural sciences, Machine learning, Task (project management), Cognitive psychology, Artificial intelligence, Computer science, Space (commercial competition)

Abstract: … for learning from human reward has hitherto not been explored systematically. Using model-based reinforcement learning … future rewards should be discounted to create behavior that …

References (3)

Andrew Y. Ng, Stuart J. Russell, Daishi Harada. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. International Conference on Machine Learning, pp. 278-287 (1999).

Brenna D. Argall, Sonia Chernova, Manuela Veloso, Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, vol. 57, pp. 469-483 (2009). DOI: 10.1016/J.ROBOT.2008.10.024

A.G. Barto, R.S. Sutton. Reinforcement Learning: An Introduction (1998).

