Authors: Rudolf Lioutikov, Scott Niekum, Ufuk Topcu, Wonjoon Goo, Farzan Memarian
DOI:
Keywords: Computer science, Function (mathematics), Inference, Machine learning, Sample (statistics), SIGNAL (programming language), Reinforcement learning, Classifier (linguistics), Artificial intelligence
Abstract: … reward inference and policy update steps: the original sparse reward provides a self-supervisory signal for reward … newly inferred, typically dense reward function. We introduce theory …