Transfer tagging from image to video

Authors: Yang Yang, Yi Yang, Zi Huang, Heng Tao Shen

DOI: 10.1145/2072298.2071958

Abstract: A massive amount of web video data has emerged on the Internet. To achieve effective and efficient video retrieval, it is critical to automatically assign semantic keywords to videos via content analysis. However, most existing video tagging methods suffer from a lack of sufficient tagged training videos, due to the high labor cost of manual tagging. Inspired by the observation that there are many more well-labeled data in other yet relevant types of media (e.g. images), in this paper we study how to build a "cross-media tunnel" to transfer external tag knowledge from image to video. Meanwhile, the intrinsic structures of both the image and video spaces are explored for inferring tags. We propose a Cross-Media Tag Transfer (CMTT) paradigm which is able to: 1) transfer tag knowledge between image and video by minimizing their distribution difference; 2) infer tags by revealing the underlying manifold structures embedded within both spaces. We also learn an explicit mapping function to handle unseen videos. Experimental results are reported and analyzed to illustrate the superiority of our proposal.
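
The abstract does not say how the distribution difference between image and video features is measured. As an illustration only, the sketch below assumes a kernel Maximum Mean Discrepancy (MMD) with a Gaussian kernel, which is one common way to compare two samples. The function names, the `gamma` parameter, and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs, y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(images, videos, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two samples."""
    return (mean_kernel(images, images, gamma)
            + mean_kernel(videos, videos, gamma)
            - 2.0 * mean_kernel(images, videos, gamma))


# Toy 2-D features (assumed for illustration, not real image/video data).
image_feats = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
video_near = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]  # similar distribution
video_far = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]   # shifted distribution

print("near:", mmd_squared(image_feats, video_near))
print("far: ", mmd_squared(image_feats, video_far))
```

A smaller value means the two feature samples look more alike under the chosen kernel. A transfer method of the kind described would aim to reduce such a discrepancy while learning tags.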

References (10)
Jun Yang, Rong Yan, Alexander G. Hauptmann. Cross-domain video concept detection using adaptive SVMs. Proceedings of the 15th International Conference on Multimedia - MULTIMEDIA '07, pp. 188-197 (2007). DOI: 10.1145/1291233.1291276
Tat-Seng Chua, Jinhui Tang, Richang Hong, Haojie Li, Zhiping Luo, Yantao Zheng. NUS-WIDE. Proceeding of the ACM International Conference on Image and Video Retrieval - CIVR '09, pp. 48- (2009). DOI: 10.1145/1646396.1646452
Timo Ojala, Matti Pietikäinen, David Harwood. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition, vol. 29, pp. 51-59 (1996). DOI: 10.1016/0031-3203(95)00067-4
Alexander Loui, Jiebo Luo, Shih-Fu Chang, Dan Ellis, Wei Jiang, Lyndon Kennedy, Keansub Lee, Akira Yanagawa. Kodak's consumer video benchmark data set. Proceedings of the international workshop on Workshop on multimedia information retrieval - MIR '07, pp. 245-254 (2007). DOI: 10.1145/1290082.1290117
Lixin Duan, Dong Xu, Ivor Wai-Hung Tsang, Jiebo Luo. Visual event recognition in videos by learning from web data. Computer Vision and Pattern Recognition, vol. 34, pp. 1667-1680 (2010). DOI: 10.1109/TPAMI.2011.265
Yi Yang, Yue-Ting Zhuang, Fei Wu, Yun-He Pan. Harmonizing Hierarchical Manifolds for Multimedia Document Semantics Understanding and Cross-Media Retrieval. IEEE Transactions on Multimedia, vol. 10, pp. 437-446 (2008). DOI: 10.1109/TMM.2008.917359
Feiping Nie, Dong Xu, Ivor Wai-Hung Tsang, Changshui Zhang. Flexible Manifold Embedding: A Framework for Semi-Supervised and Unsupervised Dimension Reduction. IEEE Transactions on Image Processing, vol. 19, pp. 1921-1932 (2010). DOI: 10.1109/TIP.2010.2044958
Xiaojin Zhu. Semi-Supervised Learning Literature Survey. University of Wisconsin-Madison, Department of Computer Sciences (2005).
Mark J. Huiskes, Michael S. Lew. The MIR Flickr retrieval evaluation. Proceeding of the 1st ACM International Conference on Multimedia Information Retrieval - MIR '08, pp. 39-43 (2008). DOI: 10.1145/1460096.1460104
K. M. Borgwardt, A. Gretton, M. J. Rasch, H.-P. Kriegel, B. Schölkopf, A. J. Smola. Integrating structured biological data by Kernel Maximum Mean Discrepancy. Intelligent Systems in Molecular Biology, vol. 22, pp. 49-57 (2006). DOI: 10.1093/BIOINFORMATICS/BTL242