Authors: Hunor S. Jakab, Lehel Csató
DOI: 10.1007/978-3-319-09903-3_14
Keywords:
Abstract: We present a novel sparsification and value function approximation method for on-line reinforcement learning in continuous state-action spaces. Our approach is based on the kernel least squares temporal difference learning algorithm. We derive a recursive version and enhance the algorithm with a new sparsification mechanism based on the topology obtained from proximity graphs. The sparsification mechanism, necessary to speed up computations, favors datapoints minimizing the divergence of the target-function gradient, thereby also taking into account the shape of the target function. The performance of our method is tested on a standard benchmark RL problem, and comparisons with existing approaches are provided.
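To make the recursive kernel least squares TD idea mentioned in the abstract concrete, here is a minimal sketch over a fixed dictionary of basis states. It implements only the generic recursive-LSTD update (a Sherman-Morrison rank-one update of the inverse statistics matrix) with RBF kernel features; the proximity-graph sparsification that is the paper's contribution is not reproduced, and all names (RecursiveKLSTD, features) and parameter values are illustrative assumptions, not the authors' code.

```python
import numpy as np

class RecursiveKLSTD:
    """Sketch: recursive least-squares TD with RBF kernel features
    over a fixed dictionary of states (hypothetical names/values)."""

    def __init__(self, dictionary, gamma=0.99, eps=1.0, sigma=1.0):
        self.dictionary = np.asarray(dictionary)  # (n, d) basis states
        self.gamma = gamma                        # discount factor
        self.sigma = sigma                        # RBF kernel bandwidth
        n = len(self.dictionary)
        self.C = np.eye(n) / eps                  # running estimate of A^{-1}
        self.w = np.zeros(n)                      # value-function weights

    def features(self, s):
        # Kernel feature vector: k(s, s_i) for every dictionary point s_i
        d = self.dictionary - np.asarray(s)
        return np.exp(-np.sum(d * d, axis=1) / (2.0 * self.sigma ** 2))

    def update(self, s, r, s_next):
        phi = self.features(s)
        u = phi - self.gamma * self.features(s_next)  # TD feature difference
        Cphi = self.C @ phi
        denom = 1.0 + u @ Cphi
        # recursive least-squares weight update driven by the TD error
        self.w += Cphi * (r - u @ self.w) / denom
        # Sherman-Morrison rank-one update of the inverse matrix
        self.C -= np.outer(Cphi, u @ self.C) / denom

    def value(self, s):
        # Approximate state value V(s) = sum_i w_i k(s, s_i)
        return self.features(s) @ self.w
```

Each observed transition (s, r, s') updates both the inverse matrix and the weights in O(n^2) time for n dictionary points, which is why a sparsification mechanism that keeps the dictionary small, such as the proximity-graph criterion the abstract describes, matters in practice.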