Robust reinforcement learning control with static and dynamic stability

Authors: R. Matthew Kretchmar, Peter M. Young, Charles W. Anderson, Douglas C. Hittle, Michael L. Anderson

DOI: 10.1002/RNC.670

Abstract: Robust control theory is used to design stable controllers in the presence of uncertainties. This provides powerful closed-loop robustness guarantees, but can result in controllers that are conservative with regard to performance. Here we present an approach to learning a better controller through observing actual controlled behaviour. A neural network is placed in parallel with the robust controller and is trained through reinforcement learning to optimize performance over time. By analysing nonlinear and time-varying aspects of a neural network via uncertainty models, a robust reinforcement learning procedure results that is guaranteed to remain stable even as the neural network is being trained. The behaviour of this procedure is demonstrated and analysed on two control tasks. Results show that at intermediate stages the system without robust constraints goes through a period of unstable behaviour that is avoided when the robust constraints are included. Copyright © 2001 John Wiley & Sons, Ltd.
