Teaching Robots to Predict Human Motion

Authors: Liang-Yan Gui, Kevin Zhang, Yu-Xiong Wang, Xiaodan Liang, Jose M. F. Moura

DOI: 10.1109/IROS.2018.8594452

Abstract: Teaching a robot to predict and mimic how a human moves or acts in the near future by observing a series of historical human movements is a crucial first step in human-robot interaction and collaboration. In this paper, we instrument a robot with such a prediction ability by leveraging recent deep learning and computer vision techniques. First, our system takes images from the robot camera as input to produce the corresponding human skeleton based on real-time human pose estimation obtained with the OpenPose library. Then, conditioning on this historical sequence, the robot forecasts plausible motion through a motion predictor, generating a corresponding demonstration. Because of a lack of high-level fidelity validation, existing forecasting algorithms suffer from error accumulation and inaccurate prediction. Inspired by generative adversarial networks (GANs), we introduce a global discriminator that examines whether the predicted sequence is smooth and realistic. Our resulting motion GAN model achieves superior prediction performance to state-of-the-art approaches when evaluated on the standard H3.6M dataset. Based on this motion GAN model, the robot demonstrates its ability to replay the predicted motion in a human-like manner when interacting with a person.
