PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards

Authors: Scott Niekum, Raymond J. Mooney, Prasoon Goyal

Keywords: Reinforcement learning, Natural language, Domain (software engineering), Computer science, Machine learning, Pixel, Artificial intelligence, Structure (mathematical logic), Robot, Task (project management), Sample (statistics)

Abstract: Reinforcement learning (RL), particularly in sparse reward settings, often requires prohibitively large numbers of interactions with the environment, thereby limiting its …

References (31)

S.R.K. Branavan, D. Silver, R. Barzilay. Learning to Win by Reading Manuals in a Monte-Carlo Framework. Meeting of the Association for Computational Linguistics, vol. 43, pp. 268-277 (2011). DOI: 10.1613/JAIR.3484

Andrew Y. Ng, Stuart J. Russell, Daishi Harada. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. International Conference on Machine Learning, pp. 278-287 (1999).

Karthik Narasimhan, Tejas Kulkarni, Regina Barzilay. Language Understanding for Text-based Games using Deep Reinforcement Learning. Empirical Methods in Natural Language Processing, pp. 1-11 (2015). DOI: 10.18653/V1/D15-1001

Brenna D. Argall, Sonia Chernova, Manuela Veloso, Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, vol. 57, pp. 469-483 (2009). DOI: 10.1016/J.ROBOT.2008.10.024

Shao Zhifei, Er Meng Joo. A survey of inverse reinforcement learning techniques. International Journal of Intelligent Computing and Cybernetics, vol. 5, pp. 293-311 (2012). DOI: 10.1108/17563781211255862

S.R.K. Branavan, Regina Barzilay, Tao Lei, Nate Kushman. Learning High-Level Planning from Text. Meeting of the Association for Computational Linguistics, pp. 126-135 (2012).

Gregory Kuhlmann, Peter Stone, Raymond J. Mooney, Jude W. Shavlik. Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer. National Conference on Artificial Intelligence (2004).

Thomas Kollar, Nicholas Roy, Steven Dickerson, Ashis Gopal Banerjee, Matthew R. Walter, Stefanie Tellex, Seth Teller. Understanding natural language commands for robotic navigation and mobile manipulation. National Conference on Artificial Intelligence, pp. 1507-1514 (2011).

Michael Littman, James MacGlashan, Robert Loftin, Bei Peng, David Roberts, Matthew Taylor. Training an Agent to Ground Commands with Reward and Punishment. National Conference on Artificial Intelligence (2014).

Christopher Sauer, Alexander Sosa, Russell Kaplan. Beating Atari with Natural Language Guided Reinforcement Learning. arXiv: Artificial Intelligence (2017).

Similar articles (2)

Learning Rewards from Linguistic Feedback.

Evaluating the Robustness of Natural Language Reward Shaping Models to Spatial Relations