Authors: Herke van Hoof, Felix Schmitt, Jan Wöhlke
DOI:
Keywords:
Abstract: Sparse reward problems present a challenge for reinforcement learning (RL) agents. Previous work has shown that choosing start states according to a curriculum can significantly improve performance. We observe that many existing curriculum generation algorithms rely on two key components: performance measure estimation and a selection policy. Therefore, we propose a unifying framework for performance-based start state curricula in RL, which allows us to analyze and compare the influence of these components on performance. Furthermore, a new selection policy using spatial performance gradients is introduced. We conduct extensive empirical evaluations to investigate the model choice for performance measure estimation. Benchmarking on difficult robotic navigation tasks and a high-dimensional manipulation task, we demonstrate the state-of-the-art performance of our novel spatial gradient curriculum.
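The abstract describes selecting start states via spatial gradients of an estimated performance measure. The following is a minimal, hypothetical sketch of that idea (not the authors' implementation): given estimated success probabilities over a 2-D grid of candidate start states, start states are sampled in proportion to the local gradient magnitude, concentrating training at the frontier between mastered and unmastered regions. All names, shapes, and values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Estimated probability of task success when starting from each grid cell
# (in practice this would be estimated from rollouts of the current policy).
perf = np.array([
    [0.9, 0.8, 0.4, 0.1],
    [0.8, 0.6, 0.3, 0.0],
    [0.7, 0.4, 0.1, 0.0],
])

# Spatial gradient magnitude of the performance estimate: large values
# mark the "frontier" between mastered and unmastered start states.
gy, gx = np.gradient(perf)
grad_mag = np.hypot(gx, gy)

# Sample candidate start states proportional to gradient magnitude.
probs = grad_mag.ravel() / grad_mag.sum()
idx = rng.choice(perf.size, size=5, p=probs)
starts = [np.unravel_index(i, perf.shape) for i in idx]
print(starts)  # sampled (row, col) start cells near the frontier
```

In this sketch, cells with uniformly high or uniformly low estimated success are rarely chosen, while cells where performance changes sharply are favored.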