Authors: Deepak Ramachandran, Rakesh Gupta
DOI: 10.1109/ROBOT.2009.5152707
Keywords:
Abstract: … reinforcement learning algorithm called Smoothed Sarsa that learns a good policy for these delivery tasks by delaying the backup reinforcement … Smoothed Sarsa learns a policy orders …