PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards

Authors: Scott Niekum, Raymond J. Mooney, Prasoon Goyal

DOI:

Keywords: Reinforcement learning, Natural language, Domain (software engineering), Computer science, Machine learning, Pixel, Artificial intelligence, Structure (mathematical logic), Robot, Task (project management), Sample (statistics)

Abstract: Reinforcement learning (RL), particularly in sparse reward settings, often requires prohibitively large numbers of interactions with the environment, thereby limiting its …

References (31)
S.R.K. Branavan, D. Silver, R. Barzilay. Learning to Win by Reading Manuals in a Monte-Carlo Framework. Meeting of the Association for Computational Linguistics, vol. 43, pp. 268-277 (2011). DOI: 10.1613/JAIR.3484
Andrew Y. Ng, Stuart J. Russell, Daishi Harada. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. International Conference on Machine Learning, pp. 278-287 (1999).
Karthik Narasimhan, Tejas Kulkarni, Regina Barzilay. Language Understanding for Text-based Games using Deep Reinforcement Learning. Empirical Methods in Natural Language Processing, pp. 1-11 (2015). DOI: 10.18653/V1/D15-1001
Brenna D. Argall, Sonia Chernova, Manuela Veloso, Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, vol. 57, pp. 469-483 (2009). DOI: 10.1016/J.ROBOT.2008.10.024
Shao Zhifei, Er Meng Joo. A survey of inverse reinforcement learning techniques. International Journal of Intelligent Computing and Cybernetics, vol. 5, pp. 293-311 (2012). DOI: 10.1108/17563781211255862
S.R.K. Branavan, Regina Barzilay, Tao Lei, Nate Kushman. Learning High-Level Planning from Text. Meeting of the Association for Computational Linguistics, pp. 126-135 (2012).
Gregory Kuhlmann, Peter Stone, Raymond J. Mooney, Jude W. Shavlik. Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer. National Conference on Artificial Intelligence (2004).
Thomas Kollar, Nicholas Roy, Steven Dickerson, Ashis Gopal Banerjee, Matthew R. Walter, Stefanie Tellex, Seth Teller. Understanding natural language commands for robotic navigation and mobile manipulation. National Conference on Artificial Intelligence, pp. 1507-1514 (2011).
Michael Littman, James MacGlashan, Robert Loftin, Bei Peng, David Roberts, Matthew Taylor. Training an Agent to Ground Commands with Reward and Punishment. National Conference on Artificial Intelligence (2014).
Christopher Sauer, Alexander Sosa, Russell Kaplan. Beating Atari with Natural Language Guided Reinforcement Learning. arXiv: Artificial Intelligence (2017).