Authors: Scott Niekum, Daniel S. Brown, Wonjoon Goo, Prabhat Nagarajan
DOI:
Keywords: Benchmark (computing), Machine learning, Ranking, Task (project management), Noise (video), Artificial intelligence, Function (engineering), Reinforcement learning, Computer science, Set (psychology)
Abstract: A critical flaw of existing inverse reinforcement learning (IRL) methods is their inability to significantly outperform the demonstrator. This is because IRL typically seeks a reward …