Authors: Ajay Divakaran, Regunathan Radhakrishnan
DOI:
Keywords: Audio signal, Identification (information), Segmentation, Audio analyzer, Histogram, Cluster analysis, Computer science, Joint (audio engineering), Hidden Markov model, Speech recognition
Abstract: A method segments and summarizes a news video using both audio and visual features extracted from the video. The summaries can be used to quickly browse the video to locate topics of interest. A generalized sound recognition hidden Markov model (HMM) framework for joint segmentation and classification of the audio signal is used. The HMM not only provides a label for each segment, but also compact state duration histogram descriptors. Using these descriptors, contiguous male and female speech segments are clustered to detect different presenters in the video. Second level clustering is performed using motion activity and color to establish correspondences between distinct speaker clusters obtained from the audio analysis. Presenters are then identified as those clusters that either occupy a significant period of time, or appear at different times throughout the video. Identification of presenters marks the beginning and ending of semantic boundaries. The semantic boundaries are used to generate a hierarchical summary of the video for fast browsing.
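
The descriptor and first-level clustering steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a state duration histogram is the fraction of frames spent in each HMM state along a decoded state path, and it substitutes a simple greedy threshold clustering with an L1 distance for whatever clustering the method actually uses. The function names, the distance measure, and the threshold are all hypothetical.

```python
from collections import Counter


def state_duration_histogram(state_path, n_states):
    """Fraction of frames spent in each state of a decoded HMM state path.

    Assumption: this normalized occupancy count stands in for the
    "state duration histogram descriptor" mentioned in the abstract.
    """
    if not state_path:
        raise ValueError("state_path must not be empty")
    counts = Counter(state_path)
    total = len(state_path)
    return [counts.get(s, 0) / total for s in range(n_states)]


def l1_distance(h1, h2):
    """Sum of absolute differences between two histograms (hypothetical choice)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def cluster_segments(histograms, threshold):
    """Greedy clustering of segment histograms (illustrative stand-in).

    Each segment joins the nearest existing cluster whose centroid lies
    within `threshold`; otherwise it starts a new cluster. Returns one
    cluster label per segment, in input order.
    """
    centroids = []  # running mean histogram per cluster
    sizes = []      # number of segments per cluster
    labels = []
    for hist in histograms:
        best, best_dist = None, threshold
        for c, centroid in enumerate(centroids):
            dist = l1_distance(hist, centroid)
            if dist <= best_dist:
                best, best_dist = c, dist
        if best is None:
            centroids.append(list(hist))
            sizes.append(1)
            labels.append(len(centroids) - 1)
        else:
            sizes[best] += 1
            n = sizes[best]
            # Update the running mean of the chosen cluster.
            centroids[best] = [
                (c_val * (n - 1) + x) / n
                for c_val, x in zip(centroids[best], hist)
            ]
            labels.append(best)
    return labels


# Toy state paths for four speech segments (made-up data, 3-state model).
paths = [[0, 0, 0, 1], [0, 0, 1, 0], [2, 2, 2, 1], [2, 2, 1, 2]]
hists = [state_duration_histogram(p, 3) for p in paths]
labels = cluster_segments(hists, threshold=0.5)
```

In this toy input, segments with similar state occupancy receive the same label, which is the role the abstract assigns to speaker clusters. The second-level clustering on motion activity and color is not sketched here.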