Abstract: Platt's resource-allocation network (RAN) (Platt, 1991a, 1991b) is modified for a reinforcement-learning paradigm and to "restart" existing hidden units rather than adding new units. After restarting, learning continues via back-propagation. The resulting restart algorithm is tested in a Q-learning network that learns to solve an inverted pendulum problem. Solutions are found faster on average with the restart algorithm than without it.
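The core idea of "restarting" a hidden unit, as opposed to RAN's allocation of new units, can be sketched as reinitializing an existing unit's weights in place. The following is a minimal illustration under assumed details: the network sizes, the reinitialization scale, and the smallest-outgoing-weight selection heuristic are all hypothetical choices, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer network (hypothetical sizes, for illustration only).
n_in, n_hidden, n_out = 4, 8, 1
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))   # input -> hidden weights
b1 = np.zeros(n_hidden)                              # hidden biases
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))  # hidden -> output weights

def restart_unit(j, W1, b1, W2, rng):
    """'Restart' hidden unit j: reinitialize its incoming weights instead of
    adding a new unit. (This reinitialization scheme is an assumption, not
    taken from the paper.)"""
    W1[j] = rng.normal(scale=0.1, size=W1.shape[1])
    b1[j] = 0.0
    # Zero the outgoing weights so the restarted unit initially leaves the
    # network's output unchanged; back-propagation then tunes it from there.
    W2[:, j] = 0.0
    return W1, b1, W2

# Select a unit to restart, e.g. the one contributing least to the output
# (a common pruning heuristic; the paper's criterion may differ).
j = int(np.argmin(np.abs(W2).sum(axis=0)))
W1, b1, W2 = restart_unit(j, W1, b1, W2, rng)
```

After the restart, ordinary back-propagation resumes on the full network, so the reinitialized unit is free to learn a new role without growing the network.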