Automatic programming of behavior-based robots using reinforcement learning

Authors: Sridhar Mahadevan, Jonathan Connell

Abstract: This paper describes a general approach for automatically programming a behavior-based robot. New behaviors are learned by trial and error using a performance feedback function as reinforcement. Two algorithms for behavior learning are described that combine techniques for propagating reinforcement values temporally across actions and spatially across states. A robot called OBELIX (see Figure 1) is described that learns several component behaviors in an example task involving pushing boxes. An experimental study suggests two conclusions. One, the learning techniques are able to learn individual behaviors, sometimes outperforming a hand-coded program. Two, a behavior-based architecture is better than a monolithic architecture for learning the box-pushing task.

References (10)

Long-Ji Lin. Programming robots using reinforcement learning and teaching. National Conference on Artificial Intelligence, pp. 781-786 (1991).

Richard S. Sutton. Integrated architecture for learning, planning, and reacting based on approximating dynamic programming. International Conference on Machine Learning, pp. 216-224 (1990). DOI: 10.1016/B978-1-55860-141-3.50030-4

Jonathan H. Connell. Minimalist mobile robotics: a colony-style architecture for an artificial creature. Academic Press Professional, Inc. (1990).

Long-Ji Lin. Self-improving reactive agents: case studies of reinforcement learning frameworks. Simulation of Adaptive Behavior, pp. 297-305 (1991).

Leslie Pack Kaelbling, Nils J. Nilsson. Learning in Embedded Systems (1993).

Steven D. Whitehead, Dana H. Ballard. Active perception and reinforcement learning. Neural Computation, vol. 2, pp. 409-419 (1990). DOI: 10.1162/NECO.1990.2.4.409

R. Brooks. A robust layered control system for a mobile robot. International Conference on Robotics and Automation, vol. 2, pp. 204-213 (1986). DOI: 10.1109/JRA.1986.1087032

A.D. Christiansen, M.T. Mason, T.M. Mitchell. Learning reliable manipulation strategies without initial physical models. International Conference on Robotics and Automation, pp. 1224-1230 (1990). DOI: 10.1109/ROBOT.1990.126165

Rodney A. Brooks, Pattie Maes. Learning to coordinate behaviors. National Conference on Artificial Intelligence, pp. 796-802 (1990).

C. J. C. H. Watkins. Learning from delayed rewards. Ph.D. thesis, Cambridge University Psychology Department (1989).
