Authors: Evaggelos Spyrou, Giorgos Tolias, Phivos Mylonas, Yannis Avrithis
DOI: 10.1007/S11042-008-0237-9
Keywords: Artificial intelligence, Representation (mathematics), Cluster analysis, Thesaurus (information retrieval), Computer vision, Basis (linear algebra), Selection (linguistics), Latent semantic analysis, Frame (networking), Sequence, Computer science
Abstract: This paper presents a video analysis approach based on concept detection and keyframe extraction employing a visual thesaurus representation. Color and texture descriptors are extracted from coarse regions of each frame, and a visual thesaurus is constructed after clustering the regions. The clusters, called region types, are used as a basis for representing local material information through the construction of a model vector for each frame, which reflects the composition of the image in terms of region types. The model vector representation is used for keyframe selection either in each video shot or across an entire sequence. The selection process ensures that all region types are represented. A number of high-level concept detectors are then trained using global annotation, and Latent Semantic Analysis is applied. To enhance detection performance per shot, detection is employed on the selected keyframes of each shot, and a framework is proposed for working on very large data sets.
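The pipeline outlined in the abstract (cluster region descriptors into region types, describe each frame by a model vector, then pick keyframes so that every region type is represented) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy 2-D descriptors, the plain k-means clustering, the minimum-distance definition of the model vector, and the greedy covering rule used for keyframe selection. The function names (`kmeans`, `model_vector`, `region_types_present`, `select_keyframes`) are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two descriptor tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; the returned centroids stand in for region types."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(col) / len(group) for col in zip(*group))
    return centroids


def model_vector(frame_regions, centroids):
    """Assumed definition: per region type, distance of the closest frame region."""
    return [min(dist(r, c) for r in frame_regions) for c in centroids]


def region_types_present(frame_regions, centroids):
    """Set of region-type indices that the frame's regions are assigned to."""
    return {
        min(range(len(centroids)), key=lambda c: dist(r, centroids[c]))
        for r in frame_regions
    }


def select_keyframes(frames, centroids):
    """Greedy covering (an assumption): add frames until all observed types are covered."""
    present = [region_types_present(f, centroids) for f in frames]
    uncovered = set().union(*present)
    chosen = []
    while uncovered:
        best = max(range(len(frames)), key=lambda i: len(present[i] & uncovered))
        chosen.append(best)
        uncovered -= present[best]
    return chosen


# Toy data: each frame is a list of hypothetical 2-D region descriptors.
frames = [
    [(0.0, 0.1), (0.1, 0.0)],
    [(0.0, 0.0), (5.0, 5.1)],
    [(5.1, 5.0), (9.9, 10.0)],
    [(10.0, 10.0), (10.1, 9.9)],
]
all_regions = [r for f in frames for r in f]
region_types = kmeans(all_regions, k=3)
vectors = [model_vector(f, region_types) for f in frames]
keyframes = select_keyframes(frames, region_types)
print(keyframes)
```

In this sketch a small entry in a model vector means the corresponding region type is well matched by some region of the frame. The concept detectors and the Latent Semantic Analysis step mentioned in the abstract are not covered here.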