Filling the Semantic Gap in Video Retrieval: An Exploration

Authors: Alexander Hauptmann, Rong Yan, Wei-Hao Lin, Michael Christel, Howard Wactlar

DOI: 10.1007/978-1-84800-076-6_10

Abstract: Digital images and motion video have proliferated in the past few years, ranging from ever-growing personal photo collections to professional news and documentary archives. In searching through these archives, digital imagery indexing based on low-level image features like colour and texture, or manually entered text annotations, often fails to meet the user’s information need, i.e. there is a semantic gap produced by “the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation” (Smeulders, Worring, Santini, Gupta and Jain 2000). The image/video analysis community has long struggled to bridge this semantic gap from low-level feature analysis (colour histograms, texture, shape) to semantic content description of video. Early video retrieval systems (Lew 2002; Smith, Lin, Naphade, Natsev and Tseng 2002) usually modelled video clips with a set of (low-level) detectable features generated from different modalities. It is possible to accurately and automatically extract low-level video features, such as histograms in the HSV, RGB, and YUV colour space, Gabor texture or wavelets, and structure through edge direction histograms and edge maps. However, because the semantic meaning of the video content cannot be expressed this way, these systems had very restricted success with this approach to video retrieval for semantic queries. Several studies have confirmed the difficulty of addressing information needs with such low-level features (Markkula and Sormunen 2000; Rodden, Basalaj, Sinclair and Wood 2001). To overcome this “semantic gap”, one approach is to utilise a set of intermediate textual descriptors that can be reliably applied to visual content concepts (e.g. outdoors, faces, animals). Many researchers have been developing automatic semantic concept classifiers such as those related to people (face, anchor, etc.), acoustic (speech, music, significant pause), objects (image blobs, buildings, graphics),

References (47)
Michael G. Christel, Alexander G. Hauptmann. The Use and Utility of High-Level Semantic Features in Video Retrieval. Lecture Notes in Computer Science, pp. 134-144 (2005). DOI: 10.1007/11526346_17
Shi-Yong Neo, Jin Zhao, Min-Yen Kan, Tat-Seng Chua. Video Retrieval Using High Level Features: Exploiting Query Matching and Confidence-Based Weighting. Lecture Notes in Computer Science, pp. 143-152 (2006). DOI: 10.1007/11788034_15
Wessel Kraaij, Paul Over, Tzveta Ianeva, Alan F. Smeaton. TRECVID 2005 - An Overview. TREC Video Retrieval Evaluation, TRECVID 2005, 14-15 November 2005, Gaithersburg, MD, USA (2006).
Rong Yan, Alexander G. Hauptmann. Probabilistic Models for Combining Diverse Knowledge Sources in Multimedia Retrieval. Carnegie Mellon University (2006).
Marjo Markkula, Eero Sormunen. End-User Searching Challenges Indexing Practices in the Digital Newspaper Photo Archive. Information Retrieval, vol. 1, pp. 259-285 (2000). DOI: 10.1023/A:1009995816485
Stephen L. Reed, Douglas B. Lenat. Mapping Ontologies into Cyc (2002).
Alexander G. Hauptmann. Towards a Large Scale Concept Ontology for Broadcast Video. Conference on Image and Video Retrieval, pp. 674-675 (2004). DOI: 10.1007/978-3-540-27814-6_78
Alan F. Smeaton, Paul Over. TRECVID: Benchmarking the Effectiveness of Information Retrieval Tasks on Digital Video. Conference on Image and Video Retrieval, pp. 19-27 (2003). DOI: 10.1007/3-540-45113-7_3
Huan Wang, Song Liu, Liang-Tien Chia. Does Ontology Help in Image Retrieval? Proceedings of the 14th Annual ACM International Conference on Multimedia (MULTIMEDIA '06), pp. 109-112 (2006). DOI: 10.1145/1180639.1180672
Yonggang Qiu, Hans-Peter Frei. Concept Based Query Expansion. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 160-169 (1993). DOI: 10.1145/160688.160713