Learning Rewards from Linguistic Feedback.

Authors: Thomas L. Griffiths, Karthik Narasimhan, Mark K. Ho, Robert X. D. Hawkins, Theodore R. Sumers

Abstract: We explore unconstrained natural language feedback as a learning signal for artificial agents. Humans use rich and varied language to teach, yet most prior work on interactive …

References (70)

Alex Djalali, Sven Lauer, Christopher Potts, Corpus evidence for preference-driven interpretation. AC'11 Proceedings of the 18th Amsterdam Colloquium Conference on Logic, Language and Meaning, pp. 150-159 (2011), 10.1007/978-3-642-31482-7_16

Thomas G. Dietterich, Alan Fern, Kshitij Judah, Saikat Roy, Reinforcement learning via practice and critique advice. National Conference on Artificial Intelligence, pp. 481-486 (2010)

D. McFadden, Conditional logit analysis of qualitative choice behavior. Frontiers in Econometrics, pp. 105-142 (1972)

Eyal Amir, Deepak Ramachandran, Bayesian inverse reinforcement learning. International Joint Conference on Artificial Intelligence, vol. 51, pp. 2586-2591 (2007)

Yoon Kim, Convolutional Neural Networks for Sentence Classification. Empirical Methods in Natural Language Processing, pp. 1746-1751 (2014), 10.3115/V1/D14-1181

Brenna D. Argall, Sonia Chernova, Manuela Veloso, Brett Browning, A survey of robot learning from demonstration. Robotics and Autonomous Systems, vol. 57, pp. 469-483 (2009), 10.1016/J.ROBOT.2008.10.024

Anca D. Dragan, Siddhartha S. Srinivasa, Kenton C.T. Lee, Legibility and predictability of robot motion. Human-Robot Interaction, pp. 301-308 (2013), 10.5555/2447556.2447672

Pieter Abbeel, Andrew Y. Ng, Apprenticeship learning via inverse reinforcement learning. Twenty-first International Conference on Machine Learning - ICML '04, pp. 1-8 (2004), 10.1145/1015330.1015430

John W. Tukey, Some selected quick and easy methods of statistical analysis. Annals of the New York Academy of Sciences, vol. 16, pp. 88-97 (1953), 10.1111/J.2164-0947.1953.TB01326.X

Luke Zettlemoyer, Yoav Artzi, Bootstrapping Semantic Parsers from Conversations. Empirical Methods in Natural Language Processing, pp. 421-432 (2011)
