Learning non-myopically from human-generated reward

W. Bradley Knox, Peter Stone
Proceedings of the 2013 international conference on Intelligent user interfaces - IUI '13 191-202

14
2013
Inter-Classifier Feedback for Human-Robot Interaction in a Domestic Setting

Juhyun Lee, W. Bradley Knox, Peter Stone
Journal of Physical Agents (JoPha) 2(2) 41-50

4
2008
Reinforcement learning from human reward: Discounting in episodic tasks

W. Bradley Knox, Peter Stone
2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication 878-885

63
2012
Learning from Human Reward Benefits from Socio-competitive Feedback

Guangliang Li, Hayley Hung, Shimon Whiteson, W. Bradley Knox
2014 Joint IEEE International Conferences on Development and Learning and Epigenetic Robotics (ICDL-Epirob) 93-100

6
2014
Domestic Interaction on a Segway Base

W. Bradley Knox, Juhyun Lee, Peter Stone
RoboCup 2008: Robot Soccer World Cup XII 519-531

2
2009
Social interaction for efficient agent learning from human reward

Guangliang Li, Shimon Whiteson, W. Bradley Knox, Hayley Hung
Autonomous Agents and Multi-Agent Systems 32(1) 1-25

13
2018
Using informative behavior to increase engagement while learning from human reward

Guangliang Li, Shimon Whiteson, W. Bradley Knox, Hayley Hung
Autonomous Agents and Multi-Agent Systems 30(5) 826-848

12
2016
The perils of trial-and-error reward design: misdesign through overfitting and invalid task specifications

Serena Booth, W Bradley Knox, Julie Shah, Scott Niekum
Proceedings of the AAAI Conference on Artificial Intelligence 37(5) 5920-5929

2023
Models of human preference for learning reward functions

W Bradley Knox, Stephane Hatgis-Kessell, Serena Booth, Scott Niekum
arXiv preprint arXiv:2206.02231

5
2022
Learning optimal advantage from preferences and mistaking it for reward

W Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson, Serena Booth
Proceedings of the AAAI Conference on Artificial Intelligence 38(9) 10066-10073

2
2024
Contrastive preference learning: Learning from human feedback without RL

Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn
arXiv preprint arXiv:2310.13639

20
2023
Understanding human teaching modalities in reinforcement learning environments: A preliminary report

W Bradley Knox, Matthew E Taylor, Peter Stone
IJCAI 2011 Workshop on Agents Learning Interactively from Human Teachers (ALIHT)

13
2011
Training a robot via human feedback: A case study

W Bradley Knox, Peter Stone, Cynthia Breazeal
Social Robotics: 5th International Conference, ICSR 2013, Bristol, UK, October 27-29, 2013, Proceedings 5 460-470

174
2013
Design Principles for Creating Human-Shapable Agents

W Bradley Knox, Ian R Fasel, Peter Stone
AAAI Spring Symposium: Agents that Learn from Human Teachers 79-86

34
2009
Teaching agents with human feedback: a demonstration of the TAMER framework

W Bradley Knox, Peter Stone, Cynthia Breazeal
65-66

23
2013
Learning from feedback on actions past and intended

W Bradley Knox, Cynthia Breazeal, Peter Stone
Proceedings of the 7th ACM/IEEE International Conference on Human-Robot Interaction, Late-Breaking Reports Session (HRI 2012)

20
2012