Authors: Marzieh Razavi, Ramya Rasipuram, Mathew Magimai.-Doss
DOI: 10.1016/J.SPECOM.2016.03.003
Keywords:
Abstract: One of the primary steps in building automatic speech recognition (ASR) and text-to-speech systems is the development of a phonemic lexicon that provides a mapping between each word and its pronunciation as a sequence of phonemes. Phoneme lexicons can be developed by humans through the use of linguistic knowledge; however, this is a costly and time-consuming task. To facilitate this process, grapheme-to-phoneme conversion (G2P) techniques are used in which, given an initial phoneme lexicon, the relationship between graphemes and phonemes is learned through data-driven methods. This article presents a novel G2P formalism which learns the grapheme-to-phoneme relationship through acoustic data and potentially relaxes the need for an initial phonemic lexicon in the target language. The formalism involves a training part followed by an inference part. In the training part, the grapheme-to-phoneme relationship is captured in a probabilistic lexical modeling framework. In this framework, a hidden Markov model (HMM) is trained in which each HMM state representing a grapheme is parameterized by a categorical distribution of phonemes. Then, in the inference part, given the orthographic transcription of the word and the learned HMM, the most probable sequence of phonemes is inferred. In this article, we show that the recently proposed acoustic G2P approach in the Kullback-Leibler divergence-based HMM (KL-HMM) framework is a particular case of this formalism. We then benchmark the approach against two popular G2P approaches, namely the joint multigram approach and the decision tree-based approach. Our experimental studies on English and French show that, despite relatively poor performance at the pronunciation level, the performance of the proposed approach is not significantly different from that of state-of-the-art G2P methods at the ASR level. © 2016 Elsevier B.V. All rights reserved.
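The two-part formalism described in the abstract can be illustrated with a deliberately simplified sketch. It assumes a hypothetical table of grapheme states, each holding a categorical distribution over phonemes (the probabilities, the phoneme labels, and the null symbol "-" are invented for illustration and are not taken from the article). Inference here is a greedy per-grapheme choice, which is a stand-in for, not a reproduction of, the decoding used by the authors. The helper kl_divergence shows the kind of local score a KL-divergence-based HMM compares distributions with.

```python
import math

# Hypothetical toy parameters: each grapheme state holds a categorical
# distribution over phonemes. All values are invented for illustration.
GRAPHEME_TO_PHONEME = {
    "c": {"k": 0.7, "s": 0.2, "-": 0.1},
    "a": {"ae": 0.6, "ey": 0.3, "-": 0.1},
    "t": {"t": 0.9, "-": 0.1},
}


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two categorical distributions given as dicts."""
    total = 0.0
    for key, p_k in p.items():
        if p_k > 0.0:
            q_k = q.get(key, 0.0)
            total += p_k * math.log((p_k + eps) / (q_k + eps))
    return total


def infer_pronunciation(word, table=GRAPHEME_TO_PHONEME, null="-"):
    """Greedy stand-in for inference: pick the most probable phoneme per
    grapheme state, skip the null symbol, and merge adjacent repeats."""
    phonemes = []
    for grapheme in word:
        dist = table[grapheme]
        best = max(dist, key=dist.get)
        if best != null and (not phonemes or phonemes[-1] != best):
            phonemes.append(best)
    return phonemes


if __name__ == "__main__":
    print(infer_pronunciation("cat"))
```

A full implementation would instead search over phoneme sequences jointly, so that one grapheme can map to several phonemes or to none depending on context; the greedy rule above cannot express that.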