Recurrent 3D Pose Sequence Machines

Authors: Xiaodan Liang, Liang Lin, Hui Cheng, Mude Lin, Keze Wang

DOI:

Keywords:

Abstract: Recovering 3D articulated human pose from monocular image sequences is very challenging because of diverse appearances, viewpoints, and occlusions, and because 3D human pose is inherently ambiguous in monocular imagery. It is thus critical to exploit rich spatial and temporal long-range dependencies among body joints for accurate 3D pose sequence prediction. Existing approaches usually rely on manually designed prior terms and human body kinematic constraints to capture structure; these are often insufficient to exploit all intrinsic structures and do not scale to all scenarios. In contrast, this paper presents a Recurrent 3D Pose Sequence Machine (RPSM) that automatically learns the image-dependent structural constraint and the sequence-dependent temporal context through multi-stage sequential refinement. At each stage, the RPSM is composed of three modules that predict the 3D pose sequence based on the previously learned 2D pose representations and 3D poses: (i) a 2D pose module that extracts image-dependent pose representations, (ii) a 3D pose recurrent module that regresses 3D poses, and (iii) a feature adaption module that serves as a bridge between modules (i) and (ii) to transform the representation from the 2D to the 3D domain. These three modules are then assembled into a sequential prediction framework that refines the predicted poses over multiple recurrent stages. Extensive evaluations on the Human3.6M dataset and the HumanEva-I dataset show that the RPSM outperforms state-of-the-art approaches for 3D pose estimation.
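The stage-wise data flow described in the abstract can be illustrated with a small control-flow sketch. This is not the authors' implementation: the three functions below are hypothetical arithmetic stand-ins for the learned modules, the names `pose_2d_module`, `feature_adaption_module`, `pose_3d_recurrent_module`, and `rpsm_forward` are invented for illustration, and resetting the temporal hidden state at the start of each stage is an assumption. Only the wiring (2D features, adaptation to the 3D domain, recurrent regression, repeated over several stages) follows the abstract.

```python
from typing import List, Optional, Tuple

Vec = List[float]


def pose_2d_module(frame: Vec, prev_feat: Optional[Vec]) -> Vec:
    """(i) Stand-in for the 2D pose module (a learned network in the paper)."""
    if prev_feat is None:
        return list(frame)
    # Later stages also see the 2D representation from the previous stage.
    return [0.5 * (f + p) for f, p in zip(frame, prev_feat)]


def feature_adaption_module(feat_2d: Vec) -> Vec:
    """(iii) Stand-in for the bridge from the 2D to the 3D feature domain."""
    return [2.0 * v for v in feat_2d]


def pose_3d_recurrent_module(
    feat_3d: Vec, prev_pose: Optional[Vec], hidden: Vec
) -> Tuple[Vec, Vec]:
    """(ii) Stand-in recurrent regressor; returns (3D pose, new hidden state)."""
    # The hidden state carries temporal context from earlier frames.
    new_hidden = [0.5 * h + 0.5 * f for h, f in zip(hidden, feat_3d)]
    if prev_pose is None:
        pose = list(new_hidden)
    else:
        # Refine the pose predicted for this frame at the previous stage.
        pose = [0.5 * (p + h) for p, h in zip(prev_pose, new_hidden)]
    return pose, new_hidden


def rpsm_forward(frames: List[Vec], num_stages: int = 3) -> List[Vec]:
    """Run multi-stage sequential refinement over a sequence of frames."""
    dim = len(frames[0])
    feats: List[Optional[Vec]] = [None] * len(frames)
    poses: List[Optional[Vec]] = [None] * len(frames)
    for _ in range(num_stages):
        hidden = [0.0] * dim  # assumption: temporal state restarts each stage
        for t, frame in enumerate(frames):
            feats[t] = pose_2d_module(frame, feats[t])
            feat_3d = feature_adaption_module(feats[t])
            poses[t], hidden = pose_3d_recurrent_module(feat_3d, poses[t], hidden)
    return [p for p in poses if p is not None]


if __name__ == "__main__":
    toy_frames = [[1.0, 2.0], [3.0, 4.0]]  # toy per-frame feature vectors
    print(rpsm_forward(toy_frames, num_stages=2))
```

Each stage revisits every frame with access to both the previous stage's 2D representation and its 3D pose estimate, which is the sense in which the predictions are refined rather than recomputed from scratch.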
