Exploiting Web Images for Semantic Video Indexing Via Robust Sample-Specific Loss

Authors: Yang Yang, Zheng-Jun Zha, Yue Gao, Xiaofeng Zhu, Tat-Seng Chua

DOI: 10.1109/TMM.2014.2323014

Abstract: Semantic video indexing, also known as video annotation or video concept detection in the literature, has been attracting significant attention in recent years. Due to the deficiency of labeled training videos, most of the existing approaches can hardly achieve satisfactory performance. In this paper, we propose a novel semantic video indexing approach, which exploits abundant user-tagged Web images to help learn robust semantic video indexing classifiers. The following two major challenges are well studied: 1) noisy Web images with imprecise and/or incomplete tags; and 2) domain difference between images and videos. Specifically, we first apply a non-parametric approach to estimate the probabilities of images being correctly tagged as confidence scores. We then develop a robust transfer video indexing (RTVI) model to learn reliable classifiers from a limited number of training videos together with an abundance of user-tagged images. The RTVI model is equipped with a novel sample-specific robust loss function, which employs the confidence score of a Web image as prior knowledge to suppress the influence and control the contribution of this image in the learning process. Meanwhile, the RTVI model discovers an optimal kernel space, in which the mismatch between images and videos is minimized for tackling the domain difference problem. Besides, we devise an iterative algorithm to effectively optimize the proposed RTVI model, and a theoretical analysis on the convergence of the proposed algorithm is provided as well. Extensive experiments on various real-world multimedia collections demonstrate the effectiveness of the proposed robust semantic video indexing approach.
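The idea of a sample-specific loss driven by per-image confidence scores can be illustrated with a small sketch. The abstract does not give the estimator or the exact loss, so everything below is an assumption made for illustration: the confidence score is approximated by k-nearest-neighbour tag agreement (one possible non-parametric estimate), and the loss is an ordinary hinge loss scaled by that score. The function names `knn_confidence` and `weighted_hinge_loss` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def knn_confidence(features, tags, k=3):
    """Assumed estimator: fraction of an image's k nearest neighbours that share its tag."""
    scores = []
    for i, x in enumerate(features):
        # Distances from image i to every other image, nearest first.
        dists = sorted(
            (math.dist(x, y), j) for j, y in enumerate(features) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        agree = sum(1 for j in neighbours if tags[j] == tags[i])
        scores.append(agree / len(neighbours))
    return scores


def weighted_hinge_loss(w, b, features, labels, weights):
    """Hinge loss in which each sample's term is scaled by its own weight."""
    total = 0.0
    for x, y, c in zip(features, labels, weights):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        total += c * max(0.0, 1.0 - margin)
    return total


# Toy data: the last image lies inside the positive cluster but carries a negative tag.
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (0.5, 0.5)]
tags = [1, 1, 1, -1, -1, -1]

conf = knn_confidence(features, tags, k=3)

# A fixed linear classifier f(x) = -x1 - x2 + 3 that separates the two clusters.
w, b = (-1.0, -1.0), 3.0
plain = weighted_hinge_loss(w, b, features, tags, [1.0] * len(tags))
robust = weighted_hinge_loss(w, b, features, tags, conf)
print(conf[-1], plain, robust)
```

In this toy setting the suspicious image receives a low confidence score, so its contribution to the weighted loss shrinks relative to the unweighted loss. The kernel-space learning for the domain difference and the iterative optimization mentioned in the abstract are not covered by this sketch.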

References (40)
Guangyu Zhu, Shuicheng Yan, Yi Ma. Image tag refinement towards low-rank, content-tag prior and error sparsity. Proceedings of the International Conference on Multimedia (MM '10), pp. 461-470 (2010). DOI: 10.1145/1873951.1874028
Zheng-Jun Zha, Meng Wang, Yan-Tao Zheng, Yi Yang, Richang Hong, Tat-Seng Chua. Interactive Video Indexing With Statistical Active Learning. IEEE Transactions on Multimedia, vol. 14, pp. 17-27 (2012). DOI: 10.1109/TMM.2011.2174782
Shiai Zhu, Chong-Wah Ngo, Yu-Gang Jiang. Sampling and Ontologically Pooling Web Images for Visual Concept Learning. IEEE Transactions on Multimedia, vol. 14, pp. 1068-1078 (2012). DOI: 10.1109/TMM.2012.2190387
Ken Chatfield, Victor Lempitsky, Andrea Vedaldi, Andrew Zisserman. The devil is in the details: an evaluation of recent feature encoding methods. British Machine Vision Conference, pp. 1-12 (2011). DOI: 10.5244/C.25.76
Jun Yang, Rong Yan, Alexander G. Hauptmann. Cross-domain video concept detection using adaptive SVMs. Proceedings of the 15th International Conference on Multimedia (MULTIMEDIA '07), pp. 188-197 (2007). DOI: 10.1145/1291233.1291276
Hao Xu, Jingdong Wang, Xian-Sheng Hua, Shipeng Li. Tag refinement by regularized LDA. ACM Multimedia, pp. 573-576 (2009). DOI: 10.1145/1631272.1631359
Yang Yang, Yi Yang, Heng Tao Shen. Effective transfer tagging from image to video. ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 9, pp. 14- (2013). DOI: 10.1145/2457450.2457456
Yang Yang, Yi Yang, Zi Huang, Heng Tao Shen, Feiping Nie. Tag localization with spatial correlations and joint group sparsity. CVPR 2011, pp. 881-888 (2011). DOI: 10.1109/CVPR.2011.5995499
Tat-Seng Chua, Jinhui Tang, Richang Hong, Haojie Li, Zhiping Luo, Yantao Zheng. NUS-WIDE. Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR '09), pp. 48- (2009). DOI: 10.1145/1646396.1646452
Yangxi Li, Dacheng Tao, Meng Wang, Zheng-Jun Zha, Chao Xu, Bo Geng. Parallel Lasso for Large-Scale Video Concept Detection. IEEE Transactions on Multimedia, vol. 14, pp. 55-65 (2012). DOI: 10.1109/TMM.2011.2174781