A Multi-Scale Hierarchical Codebook Method for Human Action Recognition in Videos Using a Single Example

Authors: Mehrsan Javan Roshtkhari, Martin D. Levine

DOI: 10.1109/CRV.2012.32

Abstract: This paper presents a novel action matching method based on a hierarchical codebook of local spatio-temporal video volumes (STVs). Given a single example of an activity as a query video, the proposed method finds similar videos to the query in a video dataset. It is based on the bag of video words (BOV) representation and does not require prior knowledge about actions, background subtraction, motion estimation or tracking. It is also robust to spatial and temporal scale changes, as well as some deformations. The algorithm yields a compact subset of salient code words of STVs for the query video, and then the likelihood of similarity between the query video and all STVs in the target video is measured using a probabilistic inference mechanism. This hierarchy is achieved by initially constructing a codebook of STVs, while considering the uncertainty in the codebook construction, which is always ignored in current versions of the BOV approach. At the second level of the hierarchy, a large contextual region containing many STVs (Ensemble of STVs) is considered in order to construct a probabilistic model of STVs and their spatio-temporal compositions. At the third level of the hierarchy, a codebook is formed for the ensembles of STVs based on their contextual similarities. The latter are the proposed labels (code words) for the actions being exhibited in the video. Finally, at the highest level of the hierarchy, the salient labels for the actions are selected by analyzing the high-level code words assigned to each image pixel as a function of time. The algorithm was applied to three available video datasets for action recognition with different complexities (KTH, Weizmann, and MSR II) and the results were superior to other approaches, especially in the cases of a single training example and cross-dataset action recognition.
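
The first level of the hierarchy described in the abstract (a codebook of STVs that keeps the assignment uncertainty rather than discarding it) can be illustrated with a rough sketch. This is not the authors' implementation: the dense sampling grid, the plain k-means clustering, the Gaussian soft-assignment weights, and all function names and parameters (`extract_stvs`, `build_codebook`, `soft_assign`, `video_histogram`, `size`, `step`, `sigma`) are assumptions chosen only to show the general idea of soft-assigned bag-of-video-words features.

```python
import math
import random


def extract_stvs(video, size=3, step=2):
    """Densely sample size^3 volumes from video[t][y][x] and flatten each one.

    Hypothetical sampling scheme; the paper's actual STV descriptor may differ.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    stvs = []
    for t in range(0, T - size + 1, step):
        for y in range(0, H - size + 1, step):
            for x in range(0, W - size + 1, step):
                stvs.append([video[t + dt][y + dy][x + dx]
                             for dt in range(size)
                             for dy in range(size)
                             for dx in range(size)])
    return stvs


def sq_dist(a, b):
    """Squared Euclidean distance between two flattened volumes."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def build_codebook(stvs, k, iters=10, seed=0):
    """Plain k-means, used here as a stand-in for the codebook construction."""
    rng = random.Random(seed)
    codewords = [list(v) for v in rng.sample(stvs, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in stvs:
            nearest = min(range(k), key=lambda j: sq_dist(v, codewords[j]))
            groups[nearest].append(v)
        for j, group in enumerate(groups):
            if group:  # keep the old code word if its cluster is empty
                codewords[j] = [sum(col) / len(group) for col in zip(*group)]
    return codewords


def soft_assign(v, codewords, sigma=1.0):
    """Return normalized weights over all code words instead of one hard label."""
    d = [sq_dist(v, c) for c in codewords]
    d_min = min(d)  # shift by the minimum for numerical stability
    w = [math.exp(-(di - d_min) / (2 * sigma ** 2)) for di in d]
    total = sum(w)
    return [wi / total for wi in w]


def video_histogram(video, codewords, sigma=1.0):
    """Average the soft assignments of all STVs into one normalized histogram."""
    stvs = extract_stvs(video)
    hist = [0.0] * len(codewords)
    for v in stvs:
        for j, w in enumerate(soft_assign(v, codewords, sigma)):
            hist[j] += w
    return [h / len(stvs) for h in hist]
```

Under these assumptions, a query video and a target video could each be reduced to such a histogram and compared with any histogram distance. The ensemble and contextual levels of the hierarchy mentioned in the abstract are not covered by this sketch.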

References (24)
Angela Yao, Juergen Gall, Luc Van Gool. A Hough transform-based voting framework for action recognition. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2061-2068 (2010). DOI: 10.1109/CVPR.2010.5539883
Silvio Savarese, Andrey DelPozo, Juan Carlos Niebles, Li Fei-Fei. Spatial-Temporal correlatons for unsupervised action classification. IEEE Workshop on Motion and Video Computing, pp. 1-8 (2008). DOI: 10.1109/WMVC.2008.4544068
Heng Wang, Muhammad Muneeb Ullah, Alexander Klaser, Ivan Laptev, Cordelia Schmid. Evaluation of local spatio-temporal features for action recognition. British Machine Vision Conference, pp. 1-11 (2009). DOI: 10.5244/C.23.124
Adriana Kovashka, Kristen Grauman. Learning a hierarchy of discriminative space-time neighborhood features for human action recognition. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2046-2053 (2010). DOI: 10.1109/CVPR.2010.5539881
M. Blank, L. Gorelick, E. Shechtman, M. Irani, R. Basri. Actions as space-time shapes. International Conference on Computer Vision, vol. 2, pp. 1395-1402 (2005). DOI: 10.1109/ICCV.2005.28
Oren Boiman, Michal Irani. Detecting Irregularities in Images and in Video. International Journal of Computer Vision, vol. 74, pp. 17-31 (2007). DOI: 10.1007/S11263-006-0009-9
Yan Ke, Rahul Sukthankar, Martial Hebert. Volumetric Features for Video Event Detection. International Journal of Computer Vision, vol. 88, pp. 339-362 (2010). DOI: 10.1007/S11263-009-0308-Z
C. Schuldt, I. Laptev, B. Caputo. Recognizing human actions: a local SVM approach. International Conference on Pattern Recognition, vol. 3, pp. 32-36 (2004). DOI: 10.1109/ICPR.2004.747
Tuan Hue Thi, Li Cheng, Jian Zhang, Li Wang, Shinichi Satoh. Integrating local action elements for action analysis. Computer Vision and Image Understanding, vol. 116, pp. 378-395 (2012). DOI: 10.1016/J.CVIU.2011.09.007
Zhuoyuan Chen, Ying Wu, Jiang Wang. Action recognition with multiscale spatio-temporal contexts. CVPR 2011, pp. 3185-3192 (2011). DOI: 10.1109/CVPR.2011.5995493