Authors: Francis Kubala, Daben Liu
DOI:
Keywords: Transcription (software), Phone, Speech recognition, Search engine indexing, Change detection, Computer science, Multimedia
Abstract: In this paper, we describe a new speaker change detection (SCD) algorithm designed for fast transcription and audio indexing of spoken broadcast news. We have developed a two-stage algorithm that begins with a gender-independent phone-class recognition pass. We collapse the phoneme inventory to only 4 broad classes and include different models for non-speech, resulting in a small decoder that runs in less than 0.1 times real-time. The second stage performs SCD: it hypothesizes a speaker change boundary between every phone labeled in the input. The phone-level time resolution of our approach permits the algorithm to run quickly while maintaining the same accuracy as a frame-level approach. Applying the new algorithms to a large sample of broadcast news programs resulted in improvements in SCD accuracy, speech recognition accuracy, and speed.
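The second stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a hypothesized boundary is accepted or rejected, so the generalized likelihood ratio (GLR) test below is a swapped-in, commonly used criterion. The function names, the one-dimensional features, the broad-class labels, and the `window`, `threshold`, and `min_frames` values are all assumptions made for illustration. What it shows is the point of the phone-level resolution: candidates are tested only at phone boundaries from the first pass, not at every frame.

```python
import math
from typing import List, Tuple

# (broad-class label, start frame, end frame exclusive); labels are hypothetical
Phone = Tuple[str, int, int]


def gaussian_loglik(frames: List[float]) -> float:
    """Log-likelihood of frames under a single 1-D Gaussian fitted to them."""
    n = len(frames)
    mean = sum(frames) / n
    # Floor the variance so constant segments do not produce log(0).
    var = max(sum((x - mean) ** 2 for x in frames) / n, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def glr_score(left: List[float], right: List[float]) -> float:
    """GLR: gain from modelling the two sides separately instead of jointly."""
    return (gaussian_loglik(left) + gaussian_loglik(right)
            - gaussian_loglik(left + right))


def detect_changes(phones: List[Phone], features: List[float],
                   window: int = 50, threshold: float = 20.0,
                   min_frames: int = 5) -> List[Tuple[int, float]]:
    """Score one candidate per phone boundary; return those above threshold."""
    hits = []
    for _, nxt in zip(phones, phones[1:]):
        boundary = nxt[1]  # start frame of the next phone
        left = features[max(0, boundary - window):boundary]
        right = features[boundary:boundary + window]
        if len(left) < min_frames or len(right) < min_frames:
            continue  # not enough data on one side to fit a model
        score = glr_score(left, right)
        if score > threshold:
            hits.append((boundary, score))
    return hits


if __name__ == "__main__":
    # Synthetic input: two "speakers" with different feature means, 20-frame phones.
    feats = ([0.5 * math.sin(i) for i in range(100)]
             + [5.0 + 0.5 * math.sin(i) for i in range(100)])
    phones = [("vowel", s, s + 20) for s in range(0, 200, 20)]
    hits = detect_changes(phones, feats)
    print(max(hits, key=lambda h: h[1])[0])  # highest-scoring candidate boundary
```

With a fixed window, candidates next to a true change also score above the threshold, so a practical system would need an extra step, such as keeping only local maxima, before reporting boundaries.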