Authors: Yu Wei, Minjia Mao, Xi Zhao, Jianhua Zou, Ping An
Keywords: Heuristics, Mobile phone, Markov decision process, Field (computer science), Reinforcement learning, Flow network, Operations research, Process (engineering), Variance (accounting), Computer science
Abstract: City metro network expansion, a subproblem of transportation network design, aims to design new lines based on the existing metro network. Existing methods in the field of transportation network design either (i) can hardly formulate this problem efficiently, (ii) depend on expert guidance to produce solutions, or (iii) appeal to problem-specific heuristics, which are difficult to design. To address these limitations, we propose a reinforcement learning method for the city metro network expansion problem. In this method, we formulate metro line expansion as a Markov decision process (MDP), which characterizes the problem as a process of sequential station selection. Then, we train an actor-critic model to design the next metro line on the basis of the existing metro network. The actor is an encoder-decoder network with an attention mechanism that generates the parameterized policy used to select stations. The critic estimates the expected cumulative reward to assist the training of the actor by reducing training variance. The proposed method does not require expert guidance during design, since the learning procedure relies only on the reward calculation to tune the policy toward better station selection. Also, it avoids the difficulty of designing heuristics, because the policy itself formalizes station selection. Considering origin-destination (OD) trips and social equity, we expand the current metro network in Xi'an, China, based on the real mobility information of 24,770,715 mobile phone users in the whole city. The results demonstrate the advantages of our method compared with existing approaches.
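The sequential station selection and the critic's variance-reducing role described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the fixed per-station scores stand in for the output of the attention-based decoder, the only feasibility rule assumed is that a station cannot be picked twice, and `od_reward`, the toy OD matrix, and the scalar `baseline` are hypothetical placeholders for the paper's reward and critic.

```python
import math
import random


def masked_softmax(scores, mask):
    """Softmax over scores; entries whose mask is False get probability 0."""
    top = max(s for s, m in zip(scores, mask) if m)  # for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_line(scores, num_stations, rng):
    """Roll out one episode: choose stations one at a time, without repeats.

    Assumption: scores are fixed here; a learned actor would recompute
    them from the current state at every step.
    """
    mask = [True] * len(scores)
    line, log_prob = [], 0.0
    for _ in range(num_stations):
        probs = masked_softmax(scores, mask)
        choice = rng.choices(range(len(scores)), weights=probs, k=1)[0]
        line.append(choice)
        log_prob += math.log(probs[choice])
        mask[choice] = False  # assumed feasibility rule: no repeated station
    return line, log_prob


def od_reward(line, od):
    """Hypothetical reward: total OD flow between station pairs on the line."""
    return sum(od[a][b] for i, a in enumerate(line) for b in line[i + 1:])


def advantage(reward, baseline):
    """Reward minus the critic's estimate; this scales the policy gradient."""
    return reward - baseline


rng = random.Random(0)
scores = [0.1, 0.5, -0.2, 1.0, 0.0]  # toy stand-in for decoder outputs
od = [[0, 3, 1, 4, 2],               # toy symmetric OD matrix
      [3, 0, 2, 5, 1],
      [1, 2, 0, 1, 3],
      [4, 5, 1, 0, 2],
      [2, 1, 3, 2, 0]]
line, log_prob = sample_line(scores, 3, rng)
reward = od_reward(line, od)
print(line, reward, advantage(reward, baseline=6.0))
```

In an actor-critic update, the actor's loss for this episode would be proportional to `-advantage * log_prob`; subtracting the critic's estimate leaves the expected gradient unchanged while reducing its variance.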