Temporal Dynamic Graph LSTM for Action-driven Video Object Detection

Authors: Abhinav Gupta, Yuan Yuan, Dit-Yan Yeung, Xiaodan Liang, Xiaolong Wang

Abstract: … We evaluate the performance of both object detection and image classification tasks on Charades. For detection, we report the average precision (AP) at 50% intersection-over-union (…

References (43)
C. Lawrence Zitnick, Piotr Dollár. Edge Boxes: Locating Object Proposals from Edges. Computer Vision – ECCV 2014, pp. 391-405 (2014). doi:10.1007/978-3-319-10602-1_26
Amir Roshan Zamir, Khurram Soomro, Mubarak Shah. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild. arXiv: Computer Vision and Pattern Recognition (2012).
Le Wang, Gang Hua, Rahul Sukthankar, Jianru Xue, Zhenxing Niu, Nanning Zheng. Video Object Discovery and Co-Segmentation with Extremely Weak Supervision. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, pp. 2074-2088 (2017). doi:10.1109/TPAMI.2016.2612187
Armand Joulin, Kevin Tang, Li Fei-Fei. Efficient Image and Video Co-localization with Frank-Wolfe Algorithm. European Conference on Computer Vision, pp. 253-268 (2014). doi:10.1007/978-3-319-10599-4_17
Chong Wang, Weiqiang Ren, Kaiqi Huang, Tieniu Tan. Weakly Supervised Object Localization with Latent Category Learning. European Conference on Computer Vision, pp. 431-445 (2014). doi:10.1007/978-3-319-10599-4_28
Ronan Collobert, Clément Farabet, Koray Kavukcuoglu. Torch7: A Matlab-like Environment for Machine Learning. Neural Information Processing Systems (2011).
Parthipan Siva, Chris Russell, Tao Xiang. In Defence of Negative Mining for Annotating Weakly Labelled Data. Computer Vision – ECCV 2012, pp. 594-608 (2012). doi:10.1007/978-3-642-33712-3_43
Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, Juan Carlos Niebles. ActivityNet: A large-scale video benchmark for human activity understanding. Computer Vision and Pattern Recognition, pp. 961-970 (2015). doi:10.1109/CVPR.2015.7298698
Thomas Deselaers, Bogdan Alexe, Vittorio Ferrari. Localizing Objects While Learning Their Appearance. Computer Vision – ECCV 2010, pp. 452-466 (2010). doi:10.1007/978-3-642-15561-1_33
Alessandro Prest, C. Leistner, J. Civera, C. Schmid, V. Ferrari. Learning object class detectors from weakly annotated video. Computer Vision and Pattern Recognition, pp. 3282-3289 (2012). doi:10.1109/CVPR.2012.6248065