Universal Off-Policy Evaluation.

Authors: Erik G. Learned-Miller, Emma Brunskill, Scott Niekum, Bruno Castro da Silva, Philip S. Thomas

Abstract: When faced with sequential decision-making problems, it is often useful to be able to predict what would happen if decisions were made using a new policy. Those predictions must …
