Integrating Task-Motion Planning with Reinforcement Learning for Robust Decision Making in Mobile Robots

Authors: Fangkai Yang, Peter Stone, Shiqi Zhang, Yuqian Jiang

DOI:

Keywords:

Abstract: Task-motion planning (TMP) addresses the problem of efficiently generating executable and low-cost task plans in a discrete space, such that the (initially unknown) action costs are determined by motion plans in a corresponding continuous space. However, a task-motion plan can be sensitive to unexpected domain uncertainty and changes, leading to suboptimal behaviors or execution failures. In this paper, we propose a novel framework, TMP-RL, which is an integration of TMP and reinforcement learning (RL) from execution experience, to solve the problem of robust task-motion planning in dynamic and uncertain domains. TMP-RL features two nested planning-learning loops. In the inner TMP loop, the robot generates a low-cost, feasible task-motion plan by iteratively planning in the discrete space and updating relevant action costs evaluated by the motion planner in the continuous space. In the outer loop, the plan is executed and the robot learns from the execution experience via model-free RL to further improve its task-motion plans. RL in the outer loop is more accurate to the current domain but also more expensive, and using the less costly planning in the inner loop leads to a jump-start for learning in the real world. Our approach is evaluated on a mobile service robot conducting navigation tasks in an office area. Results show that TMP-RL significantly improves adaptability and robustness (in comparison to TMP methods) and achieves rapid convergence (in comparison to task planning (TP)-RL methods). We also show that learned values can be reused to smoothly adapt to new scenarios during long-term deployments.
