Authors: Claude Sammut, Tak Fai Yik
DOI: 10.1007/978-3-642-05177-7_23
Keywords: Reinforcement learning, Robot, Machine learning, Planner, Humanoid robot, Constraint satisfaction problem, Robot learning, Artificial intelligence, Degrees of freedom, Learning classifier system, Computer science
Abstract: Pure reinforcement learning does not scale well to domains with many degrees of freedom, and particularly to continuous domains. In this paper, we introduce a hybrid method in which a symbolic planner constructs an approximate solution to a control problem. Subsequently, a numerical optimisation algorithm is used to refine the qualitative plan into an operational policy. The method is demonstrated on the problem of a stable walking gait for a bipedal robot. We use this approach to illustrate the benefits of multistrategy robot learning.