Authors: James Glass, Issam Bazzi
DOI:
Keywords:
Abstract: This thesis concerns the problem of unknown or out-of-vocabulary (OOV) words in continuous speech recognition. We propose a novel approach for handling OOV words within a single-stage recognition framework. To achieve this goal, an explicit and detailed model of OOV words is constructed and then used to augment the closed-vocabulary search space of a standard speech recognizer. This OOV model achieves open-vocabulary recognition through the use of more flexible subword units that can be concatenated during recognition to form new phone sequences corresponding to potential new words. Examples of such subword units are phones, syllables, or some automatically-learned multi-phone sequences. Subword units have the attractive property of being a closed set, and are thus able to cover any new words; they can conceivably cover most utterances with partially spoken words as well. The main challenge with such an approach is ensuring that the OOV model does not absorb portions of the speech signal corresponding to in-vocabulary (IV) words. In dealing with this challenge, we explore several research issues related to designing the subword lexicon, language model, and topology of the OOV model. We present a dictionary-based approach for estimating subword language models. Such language models are utilized within the subword search space to help recognize the underlying phonetic transcription of OOV words. We also propose a data-driven iterative bottom-up procedure for automatically creating a multi-phone subword inventory. Starting with individual phones, this procedure uses the maximum mutual information principle to successively merge phones to obtain longer subword units. The thesis also extends this OOV approach to modelling multiple classes of OOV words. In addition, the thesis examines an approach for combining OOV modelling with recognition confidence scoring. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
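Illustrative note: the bottom-up merging procedure mentioned in the abstract can be pictured with a small sketch. The abstract does not give the exact criterion, so the code below assumes a weighted mutual information score, p(a,b) * log(p(a,b) / (p(a) p(b))), computed over adjacent units in a pronunciation dictionary. The function names (`best_pair`, `merge_pair`, `learn_units`), the `_` joiner, and the toy phone labels are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def best_pair(sequences):
    """Return the adjacent unit pair with the highest score, or None.

    Assumed score (not confirmed by the abstract): weighted mutual
    information, p(a,b) * log(p(a,b) / (p(a) * p(b))).
    """
    unit_counts = Counter(u for seq in sequences for u in seq)
    pair_counts = Counter(
        (a, b) for seq in sequences for a, b in zip(seq, seq[1:])
    )
    n_units = sum(unit_counts.values())
    n_pairs = sum(pair_counts.values())
    if n_pairs == 0:
        return None

    def score(pair):
        a, b = pair
        p_ab = pair_counts[pair] / n_pairs
        p_a = unit_counts[a] / n_units
        p_b = unit_counts[b] / n_units
        return p_ab * math.log(p_ab / (p_a * p_b))

    return max(pair_counts, key=score)


def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of `pair` with one joined unit."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + "_" + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def learn_units(pronunciations, n_merges):
    """Start from single phones and greedily merge the best pair n_merges times."""
    sequences = [list(p) for p in pronunciations]
    merges = []
    for _ in range(n_merges):
        pair = best_pair(sequences)
        if pair is None:
            break
        merges.append(pair)
        sequences = [merge_pair(seq, pair) for seq in sequences]
    return merges, sequences


# Toy dictionary of phone sequences (made up for illustration).
toy = [["k", "ae", "t"], ["b", "ae", "t"], ["s", "ae", "t"], ["d", "ao", "g"]]
merges, units = learn_units(toy, n_merges=1)
```

On this toy input the frequent pair ("ae", "t") is merged first, so the first pronunciation becomes ["k", "ae_t"]. A real inventory would be learned from a full pronunciation dictionary with many more merge iterations and a stopping rule, neither of which is specified in the abstract.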