System and method for automatic audio content analysis for word spotting, indexing, classification and retrieval

Authors: Dragutin Petkovic, Savitha Srinivasan, Dulce Beatriz Ponceleon

DOI:

Keywords:

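Abstract: A system and method for indexing an audio stream for subsequent information retrieval and for skimming, gisting, and summarizing the audio stream includes using special audio prefiltering such that only relevant speech segments that are generated by a speech recognition engine are indexed. Specific indexing features are disclosed that improve the precision and recall of an information retrieval system used after indexing for word spotting. The invention includes rendering the audio stream into intervals, with each interval including one or more segments. For each segment of an interval it is determined whether the segment exhibits one or more predetermined audio features such as a particular range of zero crossing rates, a particular range of energy, and a particular range of spectral energy concentration. The audio features are heuristically determined to represent respective audio events including silence, music, speech, and speech on music. Also, it is determined whether a group of intervals matches a heuristically predefined meta pattern such as continuous uninterrupted speech, concluding ideas, hesitations and emphasis in speech, and so on, and the audio stream is then indexed based on the interval classification and meta pattern matching, with only relevant features being indexed to improve subsequent precision of information retrieval. Also, alternatives for longer terms generated by the speech recognition engine are indexed along with respective weights, to improve subsequent recall.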
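
The per-segment features named in the abstract (zero crossing rate and energy) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names, the segment length, the thresholds, and the majority-vote rule are hypothetical, and spectral energy concentration as well as the "music" and "speech on music" events are left out for brevity.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def mean_energy(samples):
    """Mean squared amplitude of a segment."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def split_segments(samples, segment_len):
    """Cut an interval into consecutive, non-overlapping full-length segments."""
    last_start = len(samples) - segment_len
    return [samples[i:i + segment_len] for i in range(0, last_start + 1, segment_len)]


def classify_interval(samples, segment_len=80, silence_energy=1e-4,
                      zcr_range=(0.05, 0.35), min_fraction=0.5):
    """Label an interval 'silence', 'speech', or 'non-speech'.

    All thresholds are hypothetical placeholders, not values from the patent.
    """
    segments = split_segments(samples, segment_len)
    if not segments:
        return "silence"
    # Segments whose energy reaches the (assumed) silence threshold.
    loud = [s for s in segments if mean_energy(s) >= silence_energy]
    if len(loud) / len(segments) < min_fraction:
        return "silence"
    # Count loud segments whose zero crossing rate falls in the assumed range.
    low, high = zcr_range
    in_range = sum(1 for s in loud if low <= zero_crossing_rate(s) <= high)
    return "speech" if in_range / len(loud) >= min_fraction else "non-speech"
```

Calling classify_interval on a list of float samples returns one label for the whole interval. A fuller treatment would add a spectral feature and would tune the thresholds on labeled audio.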

References (20)
Francine R. Chen, Lynn D. Wilcox, Philip A. Chou, Alex D. Poon, Vijay Balasubramanian, Donald G. Kimber, Karon A. Weber, Real-time audio recording system for automatic speaker indexing (1994)
Steven P. Russell, Michael V. McCusker, Method and apparatus for managing information (1994)
Francine R. Chen, Lynn D. Wilcox, Philip A. Chou, Alex D. Poon, Vijay Balasubramanian, Donald G. Kimber, Karon A. Weber, Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications (1994)
Barry Arons, SpeechSkimmer, ACM Transactions on Computer-Human Interaction, vol. 4, pp. 3-38 (1997), 10.1145/244754.244758
Peter F. Brown, Speech recognition system for natural language translation, Journal of the Acoustical Society of America, vol. 97, p. 1365 (1993), 10.1121/1.412155