Authors: Changxi You, Jianbo Lu, Dimitar Filev, Panagiotis Tsiotras
DOI: 10.1016/J.ROBOT.2019.01.003
Keywords:
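Abstract: Autonomous vehicles promise to improve traffic safety while, at the same time, increasing fuel efficiency and reducing congestion. They represent the main trend in future intelligent transportation systems. This paper concentrates on the planning problem of autonomous vehicles in traffic. We model the interaction between the autonomous vehicle and the environment as a stochastic Markov decision process (MDP) and consider the driving style of an expert driver as the target to be learned. The road geometry is taken into consideration in the MDP model in order to incorporate more diverse driving styles. The desired, expert-like driving behavior of the autonomous vehicle is obtained as follows: First, we design the reward function of the corresponding MDP and determine the optimal driving strategy for the autonomous vehicle using reinforcement learning techniques. Second, we collect a number of demonstrations from an expert driver and learn the optimal driving strategy based on data using inverse reinforcement learning. The unknown reward function of the expert driver is approximated using a deep neural-network (DNN). We clarify and validate the application of the maximum entropy principle (MEP) to learn the DNN reward function, and provide the necessary derivations for using the maximum entropy principle to learn a parameterized feature (reward) function. Simulated results demonstrate the desired driving behaviors of an autonomous vehicle using both reinforcement learning and inverse reinforcement learning techniques.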
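The second step in the abstract, learning a reward from expert demonstrations under the maximum entropy principle, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the five-position road, the three lateral actions, the fixed horizon, and the single hand-written demonstration are hypothetical, and a tabular reward (one parameter per state) stands in for the paper's deep neural-network reward. The update used is the standard maximum entropy gradient, that is, empirical state-visitation counts minus the visitation counts expected under the current reward.

```python
import math

N_STATES = 5          # hypothetical lateral positions 0..4 (not the paper's road model)
ACTIONS = (-1, 0, 1)  # move left, keep position, move right
HORIZON = 8           # number of states in each trajectory


def step(s, a):
    """Deterministic transition, clipped at the road boundaries."""
    return min(max(s + a, 0), N_STATES - 1)


def logsumexp(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def soft_policy(theta):
    """Backward pass: time-dependent maximum entropy policy, policy[t][s][action_index]."""
    v = list(theta)  # soft value at the final time step
    policy = []
    for _ in range(HORIZON - 1):
        q = [[theta[s] + v[step(s, a)] for a in ACTIONS] for s in range(N_STATES)]
        v = [logsumexp(q_s) for q_s in q]
        policy.append([[math.exp(q_sa - v[s]) for q_sa in q[s]]
                       for s in range(N_STATES)])
    policy.reverse()  # policy[t] now applies at time step t
    return policy


def expected_visits(theta, start=0):
    """Forward pass: expected number of visits to each state under soft_policy."""
    d = [0.0] * N_STATES
    d[start] = 1.0
    visits = list(d)
    for pi_t in soft_policy(theta):
        nxt = [0.0] * N_STATES
        for s in range(N_STATES):
            for i, a in enumerate(ACTIONS):
                nxt[step(s, a)] += d[s] * pi_t[s][i]
        d = nxt
        visits = [v + x for v, x in zip(visits, d)]
    return visits


def maxent_irl(demos, lr=0.05, iters=300):
    """Gradient ascent on the demonstration log-likelihood with a tabular reward."""
    empirical = [0.0] * N_STATES
    for traj in demos:
        for s in traj:
            empirical[s] += 1.0 / len(demos)
    theta = [0.0] * N_STATES
    for _ in range(iters):
        expected = expected_visits(theta, start=demos[0][0])
        # Gradient: empirical visitation minus expected visitation.
        theta = [th + lr * (e - x) for th, e, x in zip(theta, empirical, expected)]
    return theta


# Hypothetical expert: drifts from position 0 to position 4, then holds it.
demos = [[0, 1, 2, 3, 4, 4, 4, 4]]
theta = maxent_irl(demos)
print([round(t, 2) for t in theta])
```

The learned parameter for the position where the expert settles ends up above the parameter for the starting position. In a setting closer to the paper's, the tabular `theta` would be replaced by the output of a neural network, and the same visitation difference would be backpropagated through it.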