Policy evaluation using the Ω-return

Authors: Georgios Theocharous, George Konidaris, Scott Niekum, Philip S. Thomas

DOI:

Keywords:

Abstract: We propose the Ω-return as an alternative to the λ-return currently used by the TD(λ) family of algorithms. The benefit of the Ω-return is that it accounts for the correlation of different length returns. Because it is difficult to compute exactly, we suggest one way of approximating the Ω-return. We provide empirical studies that suggest it is superior to the λ-return and γ-return for a variety of problems.