Authors: A. Facco, D. Falavigna, R. Gretter, M. Viganò
DOI: 10.1016/J.SPECOM.2005.07.004
Keywords:
Abstract: This paper describes the specification, design and development phases of two widely used telephone services based on automatic speech recognition. The effort spent on evaluating and tuning these services is discussed in detail. In developing the first service, mainly based on the recognition of “alphanumeric” sequences, a significant part of the work consisted of refining the acoustic models. To increase recognition accuracy we adopted algorithms and methods consolidated in the past over broadcast news transcription tasks. A significant result shows that the use of task-specific context-dependent phone models reduces the word error rate by about 40% relative to using context-independent phone models. Note that the latter result was achieved on a small-vocabulary task, significantly different from those generally used in broadcast news transcription. We also investigated both unsupervised and supervised training procedures. Moreover, we studied a novel partly supervised technique that allows us to select, in some “optimal” way, the speech material to manually transcribe and use for acoustic model training. The proposed procedure gives performance close to that obtained with a completely supervised training method. In the second service, mainly based on phrase spotting, a wide effort was devoted to language model refinement. In particular, several types of rejection networks were studied to detect out-of-vocabulary words for the given task; a major result demonstrates that the use of rejection networks based on a class trigram language model reduces the word error rate from 36.7% to 11.1% with respect to using a phone loop network. For this latter service, the benefits and related costs brought by regular grammars, stochastic language models and mixed language models are also reported and discussed. Finally, notice that most of the experiments described in this paper were carried out on field databases collected through the developed services.
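The abstract quotes error rates in two ways: a relative reduction ("about 40%") and a pair of absolute figures (36.7% and 11.1%). As a minimal sketch, and assuming the pair is a word error rate before and after the change, the following shows how the absolute pair converts to a relative reduction so the two results can be compared on the same scale. The function name `relative_wer_reduction` is hypothetical and does not come from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Assumption: 36.7 is the baseline (phone loop network) WER in percent
# and 11.1 is the WER with the class trigram rejection network.
reduction = relative_wer_reduction(36.7, 11.1)
print(f"Relative WER reduction: {reduction:.1%}")
```

Under that reading, a drop from 36.7% to 11.1% is 25.6 points absolute, which is roughly a 70% relative reduction.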