Audio-Assisted Segmentation and Browsing of News Videos

Authors: Ajay Divakaran, Regunathan Radhakrishnan

DOI:

Keywords: Audio signal, Identification (information), Segmentation, Audio analyzer, Histogram, Cluster analysis, Computer science, Joint (audio engineering), Hidden Markov model, Speech recognition

Abstract: A method segments and summarizes a news video using both audio and visual features extracted from the video. The summaries can be used to quickly browse the video and locate topics of interest. A generalized sound recognition hidden Markov model (HMM) framework is used for joint segmentation and classification of the audio signal. The HMM not only provides a classification label for each audio segment, but also compact state duration histogram descriptors. Using these descriptors, contiguous male and female speech segments are clustered to detect different presenters in the video. Second-level clustering is performed using motion activity and color to establish correspondences between the distinct speaker clusters obtained from the audio analysis. Presenters are then identified as those clusters that either occupy a significant period of time, or appear at different times throughout the video. Identification of the presenters marks the beginning and ending of semantic boundaries. The semantic boundaries are used to generate a hierarchical summary of the video for fast browsing.
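The audio-side steps in the abstract (state duration histograms, clustering of speech segments, presenter identification) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance measure, the clustering algorithm, or any thresholds, so the L1 distance, the greedy threshold clustering, the parameters `threshold`, `min_fraction`, and `min_spread`, and all function names below are assumptions made for illustration. The input state paths stand in for decoded HMM state sequences of speech segments, and the second-level visual clustering is omitted.

```python
from collections import Counter


def state_duration_histogram(state_path, n_states):
    """Fraction of frames spent in each HMM state along one decoded path."""
    counts = Counter(state_path)
    return [counts.get(s, 0) / len(state_path) for s in range(n_states)]


def l1_distance(h1, h2):
    """Sum of absolute differences between two histograms (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def cluster_segments(histograms, threshold):
    """Greedy clustering (a stand-in, not the source's algorithm).

    Each cluster is represented by its first member. A segment joins the
    nearest cluster if that distance is within `threshold`; otherwise it
    opens a new cluster.
    """
    representatives = []
    labels = []
    for hist in histograms:
        distances = [l1_distance(hist, rep) for rep in representatives]
        if distances and min(distances) <= threshold:
            labels.append(distances.index(min(distances)))
        else:
            representatives.append(hist)
            labels.append(len(representatives) - 1)
    return labels


def identify_presenters(segments, labels, video_length,
                        min_fraction=0.2, min_spread=0.5):
    """Return cluster ids that occupy a large share of the video or recur
    across a wide span of it. Both thresholds are hypothetical."""
    total, first, last = {}, {}, {}
    for (start, end), c in zip(segments, labels):
        total[c] = total.get(c, 0.0) + (end - start)
        first[c] = min(first.get(c, start), start)
        last[c] = max(last.get(c, end), end)
    presenters = []
    for c in sorted(total):
        occupies = total[c] / video_length >= min_fraction
        recurs = (last[c] - first[c]) / video_length >= min_spread
        if occupies or recurs:
            presenters.append(c)
    return presenters


# Toy data: four speech segments, each with a decoded 3-state path
# and a (start, end) time in seconds within a 120-second video.
paths = [[0, 0, 0, 1], [2, 2, 2, 2], [0, 0, 1, 0], [1, 1, 1, 1]]
times = [(0, 10), (10, 40), (100, 110), (110, 120)]

hists = [state_duration_histogram(p, 3) for p in paths]
labels = cluster_segments(hists, threshold=0.3)
presenters = identify_presenters(times, labels, video_length=120)
```

In this toy run the first and third segments share a histogram and fall into one cluster that spans most of the video, so it qualifies as a presenter by recurrence; the second segment qualifies by duration alone; the fourth is short and isolated and is not selected. The start and end times of the selected clusters are what would then serve as candidate semantic boundaries.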
