A joint imitation-reinforcement learning framework for reduced baseline regret

Authors: Sheelabhadra Dey, Sumedh Pendurkar, Guni Sharon, Josiah P. Hanna

DOI:

Keywords:

Abstract: In various control task domains, existing controllers provide a baseline level of performance that, though possibly suboptimal, should be maintained. Reinforcement learning (RL) …
