Authors: Georgios Theocharous, George Konidaris, Scott Niekum, Philip S. Thomas
DOI:
Keywords:
Abstract: We propose the Ω-return as an alternative to the λ-return currently used by the TD(λ) family of algorithms. The benefit of the Ω-return is that it accounts for the correlation of different length returns. Because it is difficult to compute exactly, we suggest one way of approximating the Ω-return. We provide empirical studies that suggest that it is superior to the λ-return and γ-return for a variety of problems.