Authors: Sascha Lange, Thomas Gabel, Martin Riedmiller
DOI: 10.1007/978-3-642-27645-3_2
Keywords:
Abstract: Batch reinforcement learning is a subfield of dynamic programming-based reinforcement learning. Originally defined as the task of learning the best possible policy from a fixed set of a priori-known transition samples, the (batch) algorithms developed in this field can be easily adapted to the classical online case, where the agent interacts with the environment while learning. Due to the efficient use of collected data and the stability of the learning process, this research area has attracted a lot of attention recently. In this chapter, we introduce the basic principles and the theory behind batch reinforcement learning, describe the most important algorithms, exemplarily discuss ongoing research within this field, and briefly survey real-world applications of batch reinforcement learning.