Softstar: heuristic-guided probabilistic inference

Authors: Brian D. Ziebart, Patrick Lucey, Brenden M. Lake, Joshua B. Tenenbaum, Mathew Monfort

Abstract: Recent machine learning methods for sequential behavior prediction estimate the motives of behavior rather than the behavior itself. This higher-level abstraction improves generalization in different prediction settings, but computing predictions often becomes intractable in large decision spaces. We propose the Softstar algorithm, a softened heuristic-guided search technique for the maximum entropy inverse optimal control model of sequential behavior. This approach supports probabilistic search with bounded approximation error at a significantly reduced computational cost when compared to sampling-based methods. We present and analyze approximation guarantees, and compare performance with simulation-based inference on two distinct, complex tasks.
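
To convey the flavor of the idea, the sketch below replaces the hard minimization of an A*-style search with a soft (log-sum-exp) aggregation over paths: it approximates the softened path cost -log Z, where Z sums exp(-cost) over start-to-goal paths, by expanding partial paths best-first under cost-so-far plus an admissible heuristic. The function names (`softstar_log_partition`, `successors`, `heuristic`) and the stopping rule are illustrative assumptions, not the paper's actual interface or its formal error bound.

```python
import heapq
import itertools
import math


def softstar_log_partition(start, goal, successors, heuristic,
                           eps=1e-6, max_pops=100_000):
    """Approximate -log Z, where Z = sum over start->goal paths
    of exp(-cost(path)).

    Partial paths are expanded best-first by cost-so-far plus
    heuristic(state); the heuristic is assumed admissible (a lower
    bound on the remaining cost), so the queue head's priority
    lower-bounds the cost of every unexplored path. Illustrative
    sketch only, not the paper's exact algorithm.
    """
    tie = itertools.count()  # break priority ties without comparing states
    frontier = [(heuristic(start), next(tie), 0.0, start)]
    log_z = -math.inf        # log of the goal mass accumulated so far
    pops = 0
    while frontier and pops < max_pops:
        f, _, g, s = heapq.heappop(frontier)
        pops += 1
        # Every unexplored path costs at least f, so each contributes
        # weight <= exp(-f); stop once that is negligible next to the
        # accumulated mass (a simplified rule, not a rigorous bound).
        if log_z > -math.inf and -f < log_z + math.log(eps):
            break
        if s == goal:
            log_z = logaddexp(log_z, -g)  # merge this path's weight exp(-g)
            continue
        for s_next, cost in successors(s):
            g_next = g + cost
            heapq.heappush(
                frontier,
                (g_next + heuristic(s_next), next(tie), g_next, s_next))
    return -log_z


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))
```

On acyclic decision graphs this enumeration converges to the exact soft value; with cycles, the `max_pops` cap makes it an anytime approximation. The paper's bounded approximation error follows from its own analysis, so treat this purely as intuition for how an admissible heuristic can prioritize the probability mass that matters.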
