Emergent Collective Behaviors in a Multi-agent Reinforcement Learning Pedestrian Simulation: A Case Study

Authors: Francisco Martinez-Gil, Miguel Lozano, Fernando Fernández

DOI: 10.1007/978-3-319-14627-0_16

Keywords: Probabilistic logic, Vector quantization, Transfer of learning, Collective behavior, Reinforcement learning, Bellman equation, Function approximation, Artificial intelligence, Computer science, Simulation, Knowledge transfer

Abstract: In this work, a Multi-agent Reinforcement Learning framework is used to generate simulations of virtual pedestrian groups. The aim is to study the influence of two different learning approaches on the quality of the generated simulations. The case study consists of the simulation of crossing groups of embodied agents inside a narrow corridor. This scenario is a classic experiment in the pedestrian modeling area, because a collective behavior, specifically lane formation, emerges with real pedestrians. The paper studies the influence of different learning algorithms, function approximation approaches, and knowledge transfer mechanisms on the performance of the learned behaviors. Specifically, two RL-based schemas are analyzed. The first one, Iterative Vector Quantization with Q-Learning (ITVQQL), iteratively improves a state-space generalizer based on vector quantization. The second scheme, named TS, uses tile coding as the generalization method with the Sarsa(\(\lambda\)) algorithm. The knowledge transfer approach is based on the use of Probabilistic Policy Reuse to incorporate previously acquired knowledge into current learning processes; additionally, value function transfer is also used in the ITVQQL schema between consecutive iterations. Results demonstrate empirically that the RL framework generates individual behaviors capable of producing the expected collective behavior, as occurs with real pedestrians. This collective behavior appears independently of the learning algorithm and the generalization method used, but depends strongly on whether knowledge transfer was applied or not. In addition, the use of transfer techniques has a remarkable influence on the final performance (measured as the number of times the task is solved) of the learned behaviors.
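As a rough illustration of the policy-reuse idea mentioned in the abstract, the sketch below shows one common way to mix a previously learned policy into exploration: with probability `psi` the agent follows the old policy, otherwise it acts epsilon-greedily on its current value estimates, and `psi` decays over time. This is a generic sketch, not the authors' implementation; the names `pi_reuse_action`, `decay_psi`, `psi`, `nu`, and the dictionary-based Q-table are all assumptions made for the example.

```python
import random


def pi_reuse_action(state, q_values, past_policy, psi, epsilon, n_actions, rng=random):
    """Pick one action with a policy-reuse exploration rule (illustrative only).

    q_values:    dict mapping state -> list of action values (hypothetical layout)
    past_policy: callable state -> action, the previously learned policy
    psi:         probability of reusing the past policy at this step
    epsilon:     exploration rate used when the past policy is not reused
    """
    # With probability psi, reuse the previously acquired policy.
    if rng.random() < psi:
        return past_policy(state)
    # Otherwise act epsilon-greedily on the policy being learned.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    row = q_values.get(state, [0.0] * n_actions)
    return max(range(n_actions), key=lambda a: row[a])


def decay_psi(psi, nu=0.95):
    """Shrink the reuse probability so the agent relies less on old knowledge over time."""
    return psi * nu
```

With `psi` near 1 early in an episode the agent mostly replays the old behavior; as `psi` decays, control shifts to the policy currently being learned.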
