Use of higher level linguistic structure in acoustic modeling for speech recognition

Authors: I. Shafran, M. Ostendorf

DOI: 10.1109/ICASSP.2000.859136

Abstract: Current speech recognition systems perform poorly on conversational speech as compared to read speech, largely because of the additional acoustic variability observed in conversational speech. Our hypothesis is that there are systematic effects, related to higher level structures, that are not being captured in current models. In this paper we describe a method to extend standard clustering to incorporate such features in estimating acoustic models. We report improvements obtained on the Switchboard task over triphones and pentaphones by the use of word- and syllable-level features. In addition, we report preliminary studies with prosodic information.
