Variable risk control via stochastic optimization

Authors: Scott R Kuindersma, Roderic A Grupen, Andrew G Barto

DOI: 10.1177/0278364913476124

Abstract: We present new global and local policy search algorithms suitable for problems with policy-dependent cost variance (or risk), a property present in many robot control tasks. These algorithms exploit techniques in non-parametric heteroscedastic regression to directly model the policy-dependent distribution of cost. For local search, the learned cost model can be used as a critic for performing risk-sensitive gradient descent. Alternatively, decision-theoretic criteria can be applied to globally select policies that balance exploration and exploitation in a principled way, or to perform greedy minimization with respect to various risk-sensitive criteria. This separation of learning and policy selection permits variable risk control, where risk-sensitivity can be flexibly adjusted and appropriate policies can be selected at runtime without relearning. We describe experiments in dynamic stabilization and manipulation with a mobile manipulator that demonstrate learning of flexible, risk-sensitive policies in very few trials.
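To make the idea of selecting policies at runtime without relearning concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes a cost model has already been learned and simply scores each candidate policy by predicted mean cost plus `kappa` times the predicted standard deviation, which is one common risk-sensitive criterion. The function `select_policy`, the parameter `kappa`, and the toy arrays `cost_mean` and `cost_std` are hypothetical names introduced for illustration only.

```python
import numpy as np


def select_policy(thetas, cost_mean, cost_var, kappa):
    """Pick the candidate policy parameter with the lowest risk-adjusted cost.

    Score = predicted mean + kappa * predicted standard deviation.
    kappa > 0 is risk-averse, kappa = 0 is risk-neutral, kappa < 0 is
    risk-seeking. This criterion is an assumption made for illustration.
    """
    thetas = np.asarray(thetas, dtype=float)
    mean = np.asarray(cost_mean, dtype=float)
    std = np.sqrt(np.asarray(cost_var, dtype=float))
    score = mean + kappa * std
    return float(thetas[np.argmin(score)])


# Toy stand-in for a learned cost model over a 1-D policy parameter
# (hypothetical values, not data from the paper).
thetas = np.linspace(0.0, 1.0, 101)
cost_mean = (thetas - 0.5) ** 2   # expected cost is lowest at theta = 0.5
cost_std = thetas                 # cost variability grows with theta

# The same model serves every risk setting; only kappa changes.
risk_neutral = select_policy(thetas, cost_mean, cost_std ** 2, kappa=0.0)
risk_averse = select_policy(thetas, cost_mean, cost_std ** 2, kappa=1.0)
risk_seeking = select_policy(thetas, cost_mean, cost_std ** 2, kappa=-1.0)
```

In this toy setting the risk-neutral choice sits at the minimum of the mean, the risk-averse choice moves toward low-variance parameters, and the risk-seeking choice moves toward high-variance ones, all from a single fixed model.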
