Authors: Michael Kositsky, Andrew G. Barto
DOI: 10.1016/S0925-2312(02)00488-5
Keywords:
Abstract: Rapid human arm movements often have velocity profiles consisting of several bell-shaped acceleration–deceleration phases, sometimes overlapping in time and sometimes appearing separately. We show how such sub-movement sequences can emerge naturally as an optimal control policy is approximated by a reinforcement learning system in the face of uncertainty and feedback delay. The system learns to generate pulse-step commands, producing fast initial sub-movements followed by slow corrective sub-movements that often begin before the initial sub-movement has completed. These results suggest how the nervous system might efficiently control a stochastic motor plant under uncertainty and feedback delay.