Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching

Authors: Fengjun Lv, Ramakant Nevatia

DOI: 10.1109/CVPR.2007.383131

Keywords: Kernel (image processing), Viewpoints, Graph theory, Mathematics, Pattern recognition, Single view, Computer vision, Silhouette, Artificial intelligence, Action recognition, Viterbi algorithm, Pose

Abstract: 3D human pose recovery is considered a fundamental step in view-invariant human action recognition. However, inferring 3D poses from a single view is usually slow due to the large number of parameters that need to be estimated, and the recovered poses are often ambiguous due to perspective projection. We present an approach that does not explicitly infer 3D pose at each frame. Instead, from existing action models we search for a series of actions that best match the input sequence. In our approach, each action is modeled as a series of synthetic 2D human poses rendered from a wide range of viewpoints. The constraints on transitions between the synthetic poses are represented by a graph model called Action Net. Given the input, silhouette matching between the input frames and the key poses is performed first using an enhanced Pyramid Match Kernel algorithm. The best-matched sequence of actions is then tracked using the Viterbi algorithm. We demonstrate this approach on challenging video sets consisting of 15 complex action classes.
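The path-search step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `viterbi_path`, the toy three-pose graph, and the log-similarity values are all hypothetical, and the per-frame matching scores are assumed to be given (in the paper they come from silhouette matching with an enhanced Pyramid Match Kernel). The sketch only shows how a transition graph restricts which key pose may follow which, and how the Viterbi algorithm picks the best-scoring sequence under those restrictions.

```python
import math


def viterbi_path(log_match, successors):
    """Return the best key-pose sequence under graph transition constraints.

    log_match[t][s]: assumed log-similarity between input frame t and key pose s.
    successors[s]:   key poses allowed to follow key pose s (hypothetical graph).
    """
    n_states = len(log_match[0])
    score = list(log_match[0])  # best score of any path ending in each pose
    back = []                   # back-pointers, one list per frame after the first

    for frame_scores in log_match[1:]:
        new_score = [-math.inf] * n_states
        pointers = [-1] * n_states
        for prev in range(n_states):
            if score[prev] == -math.inf:
                continue  # pose not reachable so far
            for nxt in successors[prev]:
                candidate = score[prev] + frame_scores[nxt]
                if candidate > new_score[nxt]:
                    new_score[nxt] = candidate
                    pointers[nxt] = prev
        back.append(pointers)
        score = new_score

    # Trace back from the best final pose.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy graph: pose 0 -> 1 -> 2 -> 0, with self-loops; 0 -> 2 is not allowed.
successors = {0: [0, 1], 1: [1, 2], 2: [2, 0]}
log_match = [
    [-0.1, -2.0, -3.0],
    [-0.2, -1.5, -3.0],
    [-3.0, -0.3, -0.2],
    [-3.0, -2.0, -0.1],
]
print(viterbi_path(log_match, successors))  # prints [0, 0, 1, 2]
```

Picking the best pose independently per frame would give 0, 0, 2, 2 here, which jumps directly from pose 0 to pose 2. The graph forbids that transition, so the search routes through pose 1 instead.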
