Author: Mats Blomberg
DOI:
Keywords:
Abstract: In the current work, instantaneous adaptation in speech recognition is performed by estimating speaker properties, which modify the original trained acoustic models. We introduce a new property, the size of the model space, which is added to the previously used features, VTLN and spectral slope. These are jointly estimated for each test utterance. The new feature has been shown to be effective for children's speech using adult-trained models on TIDIGITS. Adding it lowered the error rate by around 10% relative. The overall combination of VTLN, spectral slope, and model space scaling represents a substantial 31% relative reduction compared with VTLN alone. There was no improvement among adult speakers in TIDIGITS and TIMIT. Improvement for this category is expected when the training and test sets are recorded in different conditions, such as read and spontaneous speech.