TRECVID 2004 Search and Feature Extraction Task by NUS PRIS

Authors: Shi-Yong Neo, Sheng Gao, Qi Tian, Rui Shi, Tat-Seng Chua

Abstract: This paper describes the details of our systems for the feature extraction and search tasks of TRECVID-2004. For feature extraction, we emphasize the use of a visual auto-concept annotation technique, with the fusion of text and specialized detectors, to induce concepts in videos. For the search task, our emphasis is two-fold. First, we employ query-specific models, and second, we use multi-modality features, including text, annotated concepts, OCR output, shot classes and detectors, to perform the search. Our search pipeline is similar to that employed in text-based definition question-answering approaches. We first perform query analysis to categorize the query into the categories of: {PERSON, SPORTS, FINANCE, WEATHER, DISASTER and GENERAL}. From these categories, we induce a number of constraints on the search process, including: (a) the type of features to use or emphasize; (b) the key concept terms to use; and (c) the video classes, such as sports, anchor person, etc., to exclude from the results. The results on the 60 hours of test video from the TRECVID 2004 evaluation demonstrate that our approaches are effective.
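The query-class idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the cue words in `CLASS_CUES`, the modality weights and excluded shot classes in `CONSTRAINTS`, the shot dictionary layout, and the weighted-sum fusion in `rank_shots` are all hypothetical. The sketch only shows how a query category could select which features to emphasize and which video classes to exclude.

```python
from dataclasses import dataclass, field

# Hypothetical cue words per query class (illustrative, not from the paper).
CLASS_CUES = {
    "PERSON": {"person", "president", "minister"},
    "SPORTS": {"hockey", "golf", "tennis", "goal"},
    "FINANCE": {"stock", "market", "shares"},
    "WEATHER": {"weather", "forecast", "temperature"},
    "DISASTER": {"flood", "fire", "earthquake"},
}


@dataclass
class Constraints:
    """Per-class search constraints: modality weights and shot classes to drop."""
    weights: dict
    exclude_classes: set = field(default_factory=set)


# Hypothetical constraint table; the weights are placeholders, not tuned values.
CONSTRAINTS = {
    "PERSON": Constraints({"text": 0.6, "ocr": 0.3, "concept": 0.1}, {"sports"}),
    "SPORTS": Constraints({"text": 0.4, "ocr": 0.1, "concept": 0.5}, {"anchor_person"}),
    "FINANCE": Constraints({"text": 0.5, "ocr": 0.4, "concept": 0.1}, {"sports"}),
    "WEATHER": Constraints({"text": 0.5, "ocr": 0.2, "concept": 0.3}, {"sports"}),
    "DISASTER": Constraints({"text": 0.5, "ocr": 0.1, "concept": 0.4}, {"anchor_person"}),
    "GENERAL": Constraints({"text": 0.5, "ocr": 0.2, "concept": 0.3}),
}


def classify_query(query: str) -> str:
    """Return the first class whose cue words overlap the query, else GENERAL."""
    tokens = set(query.lower().split())
    for label, cues in CLASS_CUES.items():
        if tokens & cues:
            return label
    return "GENERAL"


def rank_shots(query: str, shots: list) -> list:
    """Drop excluded shot classes, then rank by a class-weighted sum of scores."""
    constraints = CONSTRAINTS[classify_query(query)]
    ranked = []
    for shot in shots:
        if shot["shot_class"] in constraints.exclude_classes:
            continue  # constraint (c): exclude this video class from the results
        score = sum(
            weight * shot["scores"].get(modality, 0.0)
            for modality, weight in constraints.weights.items()
        )
        ranked.append((shot["id"], score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    demo_shots = [
        {"id": "s1", "shot_class": "sports", "scores": {"text": 0.5, "concept": 1.0}},
        {"id": "s2", "shot_class": "general", "scores": {"text": 1.0}},
        {"id": "s3", "shot_class": "anchor_person", "scores": {"text": 1.0, "ocr": 1.0}},
    ]
    for shot_id, score in rank_shots("find shots of a hockey goal", demo_shots):
        print(shot_id, round(score, 2))
```

In this sketch a sports query removes anchor-person shots before scoring, so a shot with strong text evidence but the wrong class never reaches the ranked list. A real system would learn or tune the weights rather than hard-code them.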
