TITGT at TRECVID 2009 workshop

Authors: Nakamasa Inoue, Shanshan Hao, Koichi Shinoda, Tatsuhiko Saito, Chin-Hui Lee

Abstract: First, we extract SIFT features from all the image frames in each shot. This multi-frame technique is expected to perform well, especially when objects are taken from different angles. Then, we model the SIFT features extracted from each shot by a Gaussian mixture model (GMM). We call the resulting GMMs SIFT GMMs. They are expected to be more robust against the quantization errors that occur in the hard-assignment clustering of the Bag-of-Keypoints approach. Furthermore, they also carry variance information of the features. The expectation-maximization (EM) algorithm is often used to estimate GMM parameters. However, there may not be enough features in a shot to estimate the parameters precisely. Hence, we estimate the GMM parameters using maximum a posteriori (MAP) adaptation, in which the a priori distribution is estimated from videos. We classify shots using support vector machines (SVMs) with a radial basis function (RBF) kernel, where the distance between GMMs is defined as a weighted sum of the Mahalanobis distances between corresponding mixture components. Acoustic features: As acoustic features, we use mel-frequency cepstrum coefficients (MFCCs), which are widely used in speech recognition. We model each high-level feature (HLF) by an ergodic hidden Markov model (HMM). We also build an HMM over the HLFs to serve as a universal background model (UBM), and use the likelihood ratio between the target HMM and the UBM for detection.
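
The visual pipeline in the abstract (MAP adaptation of a prior GMM to each shot, then an RBF kernel over a weighted sum of Mahalanobis distances) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes diagonal covariances, mean-only adaptation with a hypothetical relevance factor `tau`, and a distance that reuses the prior GMM's weights and variances; the abstract does not state any of these choices, and the function names are invented for illustration.

```python
import numpy as np


def map_adapt_means(x, weights, means, variances, tau=20.0):
    """Mean-only MAP adaptation of a diagonal-covariance prior GMM (assumed form).

    x: (n, d) features of one shot; weights: (k,); means, variances: (k, d).
    tau is a hypothetical relevance factor controlling how strongly the
    prior means are kept when a component sees few features.
    """
    # Log-density of every feature under every diagonal Gaussian component.
    diff = x[:, None, :] - means[None, :, :]                       # (n, k, d)
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_pdf                           # (n, k)
    log_post -= log_post.max(axis=1, keepdims=True)                # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                        # responsibilities
    n_k = post.sum(axis=0)                                         # soft counts, (k,)
    first_moment = post.T @ x                                      # (k, d)
    # Interpolate between prior means and the shot's weighted feature means.
    return (tau * means + first_moment) / (tau + n_k)[:, None]


def gmm_distance(means_a, means_b, weights, variances):
    """Weighted sum of squared Mahalanobis distances between matching components."""
    diff = means_a - means_b
    return float(np.sum(weights * np.sum(diff ** 2 / variances, axis=1)))


def rbf_kernel(means_a, means_b, weights, variances, gamma=1.0):
    """RBF kernel value exp(-gamma * distance) between two adapted GMMs."""
    return float(np.exp(-gamma * gmm_distance(means_a, means_b, weights, variances)))
```

A kernel matrix built from `rbf_kernel` over all pairs of shots could then be passed to an SVM that accepts precomputed kernels. Because every shot GMM is adapted from the same prior, component `k` in one shot corresponds to component `k` in another, which is what makes the component-wise distance meaningful.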
