Early Action Recognition With Category Exclusion Using Policy-Based Reinforcement Learning

Authors: Junwu Weng, Xudong Jiang, Wei-Long Zheng, Junsong Yuan

DOI: 10.1109/TCSVT.2020.2976789

Abstract: The goal of early action recognition is to predict the action label when the sequence is only partially observed. Existing methods treat the task as sequential classification problems on different observation ratios of an action sequence. Since these models are trained by differentiating the positive category from all negative classes, the diverse information of the different negative categories is ignored, which we believe can be collected to help improve recognition performance. In this paper, we step towards a new direction by introducing category exclusion to early action recognition. We model the exclusion as a mask operation on the classification probability output of a pre-trained early action recognition classifier. Specifically, we use policy-based reinforcement learning to train an agent. The agent generates a series of binary masks to exclude interfering negative categories during action execution and hence help improve the recognition accuracy. The proposed method is evaluated on three benchmark datasets: NTU-RGBD, First-Person Hand Action, and UCF-101. It enhances the recognition accuracy consistently over all observation ratios on the three datasets, where the improvements in the early stages are especially significant.
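The mask operation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the renormalization after masking, the independent Bernoulli policy over categories, and the scalar reward are all assumptions made for illustration. In particular, the sketch uses a stateless policy, whereas the abstract describes an agent that produces a series of masks as the action unfolds.

```python
import math
import random


def apply_exclusion_mask(probs, mask):
    """Zero out excluded categories (mask == 0) and renormalize the rest.

    Renormalizing is an assumption of this sketch, not a detail from the paper.
    """
    masked = [p * m for p, m in zip(probs, mask)]
    total = sum(masked)
    if total == 0.0:
        # Every category was excluded; fall back to the unmasked output.
        return list(probs)
    return [p / total for p in masked]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(logits, rng):
    """Sample a binary keep/exclude mask from independent Bernoulli units."""
    return [1 if rng.random() < sigmoid(z) else 0 for z in logits]


def reinforce_step(logits, mask, reward, lr=0.1):
    """One REINFORCE update of the (hypothetical) per-category keep logits.

    For a Bernoulli unit with keep-probability sigmoid(z), the gradient of the
    log-probability of the sampled bit m with respect to z is m - sigmoid(z).
    """
    return [z + lr * reward * (m - sigmoid(z)) for z, m in zip(logits, mask)]


# Classifier output early in a sequence: category 0 is an interfering negative.
probs = [0.40, 0.35, 0.25]
mask = [0, 1, 1]  # exclude category 0, keep the others
masked_probs = apply_exclusion_mask(probs, mask)
predicted = max(range(len(masked_probs)), key=masked_probs.__getitem__)

# Reward the mask (here +1, assuming category 1 is the true label).
new_logits = reinforce_step([0.0, 0.0, 0.0], mask, reward=1.0)

# Draw a fresh mask from the updated policy.
next_mask = sample_mask(new_logits, random.Random(0))
```

With category 0 excluded, the prediction moves from category 0 to category 1, and the positive reward pushes the keep logit of category 0 down and the others up, so the same exclusion becomes more likely in later samples.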
