Authors: Jennie Si, Andrew G. Barto, Warren Buckler Powell, Don Wunsch
DOI:
Keywords:
Abstract: This chapter focuses on learning to act in a near-optimal manner through reinforcement learning for problems that either have no model or whose model is very complex. The emphasis here is on continuous action space (CAS) methods. Monte-Carlo approaches are employed to estimate function values in an iterative, incremental procedure. Derivative-free line search methods are used to find a near-optimal action in the continuous action space for a discrete subset of the state space. This near-optimal policy is then extended to the entire continuous state space using a fuzzy additive model. To compensate for approximation errors, a modified procedure for perturbing the generated control policy is developed. Convergence results, under moderate assumptions and stopping criteria, are established. References to successful applications of the controller are provided.
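The abstract mentions derivative-free line search for selecting a near-optimal action over a continuous action interval. As a minimal illustrative sketch (not the chapter's actual algorithm), golden-section search is one such derivative-free method: it maximizes a unimodal action-value slice `q(a)` using only function evaluations. The toy `q` below is a hypothetical stand-in for a learned value estimate.

```python
import math

def golden_section_argmax(f, lo, hi, tol=1e-5):
    """Derivative-free line search: find the maximizer of a
    unimodal function f on [lo, hi] using golden-section search.
    Only function evaluations are needed, no gradients."""
    phi = (math.sqrt(5) - 1) / 2  # golden ratio conjugate, ~0.618
    a, b = lo, hi
    c = b - phi * (b - a)  # left interior probe
    d = a + phi * (b - a)  # right interior probe
    while b - a > tol:
        if f(c) > f(d):
            # Maximizer lies in [a, d]; shrink from the right.
            b, d = d, c
            c = b - phi * (b - a)
        else:
            # Maximizer lies in [c, b]; shrink from the left.
            a, c = c, d
            d = a + phi * (b - a)
    return (a + b) / 2

# Hypothetical action-value slice at a fixed state, peaked at a = 0.3.
q = lambda a: -(a - 0.3) ** 2
best_action = golden_section_argmax(q, -1.0, 1.0)
```

In a CAS setting of the kind described, such a search would be run per state in the discrete state subset, with `q` replaced by the Monte-Carlo value estimate for that state.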