Authors: Fangkai Yang, Peter Stone, Shiqi Zhang, Yuqian Jiang
DOI:
Keywords:
Abstract: Task-motion planning (TMP) addresses the problem of efficiently generating executable and low-cost task plans in a discrete space, such that the (initially unknown) action costs are determined by motion plans in a corresponding continuous space. However, a task-motion plan can be sensitive to unexpected domain uncertainty and changes, leading to suboptimal behaviors or execution failures. In this paper, we propose a novel framework, TMP-RL, which is an integration of TMP and reinforcement learning (RL) from execution experience, to solve the problem of robust task-motion planning in dynamic and uncertain domains. TMP-RL features two nested planning-learning loops. In the inner TMP loop, the robot generates a low-cost, feasible task-motion plan by iteratively updating the relevant action costs evaluated by the motion planner. In the outer loop, the plan is executed, and the robot learns from the execution experience via model-free RL to further improve its task-motion plans. RL in the outer loop is more accurate to the current domain but also more expensive; using the less costly TMP loop leads to a jump-start for learning in the real world. Our approach is evaluated on a mobile service robot conducting navigation tasks in an office area. Results show that TMP-RL significantly improves adaptability and robustness (in comparison to TMP methods) and achieves rapid convergence (in comparison to task planning (TP)-RL methods). We also show that learned values can be reused to smoothly adapt to new scenarios during long-term deployments.
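The two nested planning-learning loops described in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy reconstruction, not the authors' implementation: the domain (a four-node navigation graph), the function names (`task_plan`, `motion_eval`, `execute`, `tmp_rl`), and the specific cost values are all invented for illustration. The inner loop plans in discrete space and queries a stand-in "motion planner" for initially unknown edge costs until the plan stabilizes; the outer loop executes the plan and refines cost estimates from noisy execution experience via a model-free incremental update.

```python
# Toy sketch of TMP-RL's nested loops (hypothetical names and domain,
# not the paper's implementation). Two routes lead from room A to the
# goal; edge costs are initially unknown to the task planner.
import random

EDGES = {("A", "B"): 1.0, ("B", "goal"): 5.0,   # true continuous-space costs
         ("A", "C"): 2.0, ("C", "goal"): 1.0}

def task_plan(costs):
    """Discrete task planner: pick the cheapest A -> goal route."""
    routes = [[("A", "B"), ("B", "goal")], [("A", "C"), ("C", "goal")]]
    return min(routes, key=lambda r: sum(costs[e] for e in r))

def motion_eval(edge):
    """Motion-planner stand-in: cost an edge before any execution."""
    return EDGES[edge]

def execute(edge):
    """Real-world execution: true cost plus small observation noise."""
    return EDGES[edge] + random.uniform(-0.1, 0.1)

def tmp_rl(episodes=20, alpha=0.5, seed=0):
    random.seed(seed)
    costs = {e: 0.0 for e in EDGES}   # initially unknown -> optimistic zeros
    evaluated = set()
    for _ in range(episodes):
        # Inner TMP loop: plan, ask the motion planner to cost any
        # not-yet-evaluated edges of the plan, replan until stable.
        while True:
            plan = task_plan(costs)
            fresh = [e for e in plan if e not in evaluated]
            if not fresh:
                break
            for e in fresh:
                costs[e] = motion_eval(e)
                evaluated.add(e)
        # Outer RL loop: execute the plan and nudge each estimate toward
        # the observed execution cost (model-free incremental update).
        for e in plan:
            costs[e] += alpha * (execute(e) - costs[e])
    return plan, costs
```

Running `tmp_rl()` settles on the cheaper route through C; the motion-planner evaluations give the learner a jump-start, and execution experience then tracks the true costs, loosely mirroring the TMP/RL interplay the abstract describes.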