Uncertainty-Aware Data Aggregation for Deep Imitation Learning

Authors: Yuchen Cui, David Isele, Scott Niekum, Kikuo Fujimura

DOI: 10.1109/ICRA.2019.8794025

Keywords: Data aggregator, Benchmark (computing), Autonomous agent, Machine learning, Artificial intelligence, Training set, Task analysis, Task (computing), Dropout (neural networks), Control system, Monte Carlo method, Data modeling, Computer science

Abstract: Estimating statistical uncertainties allows autonomous agents to communicate their confidence during task execution and is important for applications in safety-critical domains such as driving. In this work, we present the uncertainty-aware imitation learning (UAIL) algorithm for improving end-to-end control systems via data aggregation. UAIL applies Monte Carlo Dropout to estimate uncertainty in the control output of end-to-end systems, using states where it is uncertain to selectively acquire new training data. In contrast to prior data aggregation algorithms that force human experts to visit sub-optimal states at random, UAIL can anticipate its own mistakes and switch control to the expert in order to prevent visiting a series of sub-optimal states. Our experimental results from simulated driving tasks demonstrate that our proposed uncertainty estimation method can be leveraged to reliably predict infractions. Our analysis shows that UAIL outperforms existing data aggregation algorithms on a series of benchmark tasks.
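
The mechanism summarized in the abstract (sample the control output with dropout left on, then hand control to the expert when the spread is large) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-layer network, the function names `mc_dropout_predict` and `choose_controller`, the dropout rate, the sample count, and the threshold value are all assumptions made for illustration only.

```python
import numpy as np

def mc_dropout_predict(x, weights, rng, n_samples=50, drop_prob=0.5):
    """Mean and variance of the output over stochastic forward passes.

    Hypothetical two-layer network; dropout stays active at inference,
    which is the core idea of Monte Carlo Dropout.
    """
    w1, w2 = weights
    outputs = []
    for _ in range(n_samples):
        hidden = np.maximum(0.0, x @ w1)                # ReLU hidden layer
        mask = rng.random(hidden.shape) >= drop_prob    # fresh dropout mask per pass
        hidden = hidden * mask / (1.0 - drop_prob)      # inverted-dropout scaling
        outputs.append(hidden @ w2)
    outputs = np.stack(outputs)
    return outputs.mean(axis=0), outputs.var(axis=0)

def choose_controller(variance, threshold):
    """Switch to the expert when the estimated uncertainty exceeds the threshold."""
    return "expert" if float(np.max(variance)) > threshold else "agent"

# Illustrative usage with random weights and a random 4-dimensional state.
rng = np.random.default_rng(0)
weights = (rng.normal(size=(4, 16)), rng.normal(size=(16, 1)))
state = rng.normal(size=(4,))
mean, var = mc_dropout_predict(state, weights, rng)
controller = choose_controller(var, threshold=0.5)  # threshold is an arbitrary assumption
```

In a data aggregation loop of the kind the abstract describes, states in which `choose_controller` returns `"expert"` would be the ones where expert labels are collected and added to the training set; how the threshold is chosen is left open here.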

References (30)
Yarin Gal, Zoubin Ghahramani. Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference. arXiv: Machine Learning (2015).
Thomas G. Dietterich. Ensemble Methods in Machine Learning. Multiple Classifier Systems, pp. 1-15 (2000). DOI: 10.1007/3-540-45014-9_1
Christopher M. Bishop. Mixture density networks. Aston University (1994).
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor Lempitsky. Domain-Adversarial Training of Neural Networks. Domain Adaptation in Computer Vision Applications, vol. 17, pp. 189-209 (2017). DOI: 10.1007/978-3-319-58347-1_10
Ilya Sutskever, Geoffrey Hinton, Alex Krizhevsky, Ruslan Salakhutdinov, Nitish Srivastava. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, vol. 15, pp. 1929-1958 (2014).
Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla. Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding. arXiv: Computer Vision and Pattern Recognition (2015).
Eric Cosatto, Beat Flepp, Jan Ben, Urs Muller, Yann LeCun. Off-Road Obstacle Avoidance through End-to-End Learning. Neural Information Processing Systems, vol. 18, pp. 739-746 (2005).
Ruslan R. Salakhutdinov, Yichuan Tang. Learning Stochastic Feedforward Neural Networks. Neural Information Processing Systems, vol. 26, pp. 530-538 (2013).
Drew Bagnell, Stéphane Ross. Efficient Reductions for Imitation Learning. International Conference on Artificial Intelligence and Statistics, pp. 661-668 (2010).
S. Chernova, M. Veloso. Interactive policy learning through confidence-based autonomy. Journal of Artificial Intelligence Research, vol. 34, pp. 1-25 (2009). DOI: 10.1613/JAIR.2584