Authors: Jennie Si, Andrew G Barto, Warren Buckler Powell, Don Wunsch
DOI:
Keywords:
Abstract: Learning and optimization of stochastic systems is a multidisciplinary area that attracts researchers in control systems, operations research, and computer science. Areas such as perturbation analysis (PA), Markov decision processes (MDP), and reinforcement learning (RL) share a common goal. This chapter gives an overview of learning and optimization from a system-theoretic perspective and shows that these seemingly different fields are closely related. This perspective also leads to new research directions, which are illustrated using a queueing example. The central concept in this area is the performance potential, which can be equivalently represented as a perturbation realization factor that measures the effect of a single change to a sample path on the system performance. Potentials or realization factors can be used as building blocks to construct performance sensitivities …