Automatic programming of behavior-based robots using reinforcement learning

Authors: Sridhar Mahadevan, Jonathan Connell

DOI:

Keywords:

Abstract: This paper describes a general approach for automatically programming a behavior-based robot. New behaviors are learned by trial and error using a performance feedback function as reinforcement. Two algorithms for behavior learning are described that combine techniques for propagating reinforcement values temporally across actions and spatially across states. A behavior-based robot called OBELIX (see Figure 1) is described that learns several component behaviors in an example task involving pushing boxes. An experimental study using the robot suggests two conclusions. One, the learning techniques are able to learn the individual behaviors, sometimes outperforming a hand-coded program. Two, using a behavior-based architecture is better than using a monolithic architecture for learning the box-pushing task.
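The abstract does not name the two propagation techniques, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes one-step Q-learning (in the style of Watkins's thesis, listed in the references) for propagating reinforcement temporally across actions, and a Hamming-distance neighborhood over bit-vector states for propagating it spatially across states. The names `q_update`, `hamming`, `visited`, and the constants `ALPHA`, `GAMMA`, and `RADIUS` are all hypothetical.

```python
ALPHA = 0.5   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)
RADIUS = 1    # max Hamming distance for spatial spreading (assumed value)


def hamming(a, b):
    """Count the positions in which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def q_update(q, visited, actions, state, action, reward, next_state):
    """Apply one Q-learning step and spread it to nearby visited states.

    q       : dict mapping (state, action) -> estimated value
    visited : set of states seen so far (bit tuples)
    """
    # Temporal propagation: back up the best value available in next_state.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + GAMMA * best_next

    # Spatial propagation: move every visited state within RADIUS bits of
    # the current state toward the same target, for the same action.
    visited.add(state)
    for s in visited:
        if hamming(s, state) <= RADIUS:
            old = q.get((s, action), 0.0)
            q[(s, action)] = old + ALPHA * (target - old)
```

A caller would keep one `q` dictionary and one `visited` set per behavior and invoke `q_update` after each action, passing the reinforcement returned by that behavior's feedback function. Setting `RADIUS = 0` reduces the sketch to plain tabular Q-learning.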

References (10)
Long-Ji Lin, Programming robots using reinforcement learning and teaching. National Conference on Artificial Intelligence, pp. 781-786 (1991)
Jonathan H. Connell, Minimalist mobile robotics: a colony-style architecture for an artificial creature. Academic Press Professional, Inc. (1990)
Long-Ji Lin, Self-improving reactive agents: case studies of reinforcement learning frameworks. Simulation of Adaptive Behavior, pp. 297-305 (1991)
Leslie Pack Kaelbling, Nils J. Nilsson, Learning in Embedded Systems (1993)
Steven D. Whitehead, Dana H. Ballard, Active perception and reinforcement learning. Neural Computation, vol. 2, pp. 409-419 (1990), doi:10.1162/NECO.1990.2.4.409
R. Brooks, A robust layered control system for a mobile robot. International Conference on Robotics and Automation, vol. 2, pp. 204-213 (1986), doi:10.1109/JRA.1986.1087032
A.D. Christiansen, M.T. Mason, T.M. Mitchell, Learning reliable manipulation strategies without initial physical models. International Conference on Robotics and Automation, pp. 1224-1230 (1990), doi:10.1109/ROBOT.1990.126165
Rodney A. Brooks, Pattie Maes, Learning to coordinate behaviors. National Conference on Artificial Intelligence, pp. 796-802 (1990)
C. J. C. H. Watkins, Learning from delayed rewards. Ph.D. thesis, Cambridge University Psychology Department (1989)