Acoustic model clustering based on syllable structure

Keywords: Computer science, Speech recognition, Word (computer architecture), Syllable, Artificial intelligence, Tree (data structure), Natural language processing, Cluster analysis, Variation (linguistics), Context (language use), Syllabic verse, Acoustic model

Abstract: Current speech recognition systems perform poorly on conversational speech as compared to read speech, arguably due to the large acoustic variability inherent in conversational speech. Our hypothesis is that there are systematic effects in local context, associated with syllabic structure, that are not being captured in current acoustic models. Such variation may be modeled using a broader definition of context than in traditional systems, which restrict context to the neighboring phonemes. In this paper, we study the use of word- and syllable-level context conditioning in recognizing conversational speech. We describe a method to extend standard tree-based clustering to incorporate a large number of features, and we report results on the Switchboard task which indicate that syllable structure outperforms pentaphones and incurs less computational cost. It has been hypothesized that previous work on syllable models for recognition of English was limited because of ignoring the phenomenon of resyllabification (change of syllable structure at word boundaries), but our analysis shows that accounting for resyllabification does not impact recognition performance.

References (2)

Mari Ostendorf, Richard Wright, Izhak Shafran, Prosody and phonetic variability: Lessons learned from acoustic model clustering (2003)

J.J. Godfrey, E.C. Holliman, J. McDaniel, SWITCHBOARD: telephone speech corpus for research and development, International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 517-520 (1992), 10.1109/ICASSP.1992.225858
