Authors: W. Bradley Knox, Peter Stone
DOI: 10.1109/ROMAN.2012.6343862
Keywords: Discounting, Reinforcement learning, Credence, Trainer, Behavioural sciences, Machine learning, Task (project management), Cognitive psychology, Artificial intelligence, Computer science, Space (commercial competition)
Abstract: … for learning from human reward has hitherto not been explored systematically. Using model-based reinforcement learning … future rewards should be discounted to create behavior that …