A deep Coarse-to-Fine network for head pose estimation from synthetic data

Authors: Yujia Wang, Wei Liang, Jianbing Shen, Yunde Jia, Lap-Fai Yu

DOI: 10.1016/J.PATCOG.2019.05.026

Abstract: Various applications of human-computer interaction are based on the estimation of head pose, which is challenging due to different facial appearance, inhomogeneous illumination, partial occlusion, etc. In this paper, we propose a deep neural network following a Coarse-to-Fine strategy to estimate head poses. The scheme includes two branches: a Coarse classification phase, which classifies the input image into four categories, and a Fine Regression phase, which estimates accurate pose parameters. The two sub-networks are trained jointly. To tackle the problem of insufficient annotated data in the training process, we design a rendering pipeline to synthesize realistic head images and generate an annotated dataset with a collection of 310k head images. Results on benchmark datasets and on the synthetic dataset validate the effectiveness of our approach, including under diverse conditions such as motion blur. Moreover, the method can be easily extended to estimate head poses from depth images.
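As a rough illustration of what "trained jointly" could mean for the two branches, the sketch below combines a cross-entropy term over the four coarse categories with a mean-squared-error term over the pose parameters. This is an assumption made for illustration only: the abstract does not state the loss functions, the weighting between them, or the pose parameterization. The names `joint_loss` and `reg_weight` and the three-angle pose vector are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_loss(coarse_logits, coarse_label, pose_pred, pose_true, reg_weight=1.0):
    """Hypothetical joint objective for a coarse-to-fine pose estimator.

    coarse_logits: raw scores for the four coarse pose categories
    coarse_label:  index (0..3) of the correct coarse category
    pose_pred:     predicted pose parameters (assumed here: three angles)
    pose_true:     ground-truth pose parameters, same length as pose_pred
    reg_weight:    assumed weight balancing the two terms
    """
    # Coarse branch: cross-entropy on the category probabilities.
    probs = softmax(coarse_logits)
    cls_loss = -math.log(probs[coarse_label])

    # Fine branch: mean squared error on the pose parameters.
    squared = [(p - t) ** 2 for p, t in zip(pose_pred, pose_true)]
    reg_loss = sum(squared) / len(squared)

    # Summing the terms lets one backward pass update both branches.
    return cls_loss + reg_weight * reg_loss


if __name__ == "__main__":
    loss = joint_loss([2.0, 0.1, -1.0, 0.3], 0, [10.0, -5.0, 2.0], [12.0, -4.0, 2.0])
    print(f"joint loss: {loss:.4f}")
```

Summing the two terms is only one common way to train a classifier and a regressor together; the paper itself may weight or structure the branches differently.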
