Authors: Harshit Sikchi, Akanksha Saran, Wonjoon Goo, Scott Niekum
DOI:
Keywords:
Abstract: We propose a new framework for imitation learning, treating imitation as a two-player ranking-based Stackelberg game between a $\textit{policy}$ and a $\textit{reward}$ function. In …