Authors: S. Renals, D. McKelvie, F. McInnes
DOI: 10.1109/ICASSP.1991.150353
Keywords:
Abstract: The recognition performances of two front ends are compared on continuous speech tasks. The first was a neural network model (NNM) front end, with frame labelling performed by a radial basis function network and segmentation by a Viterbi algorithm. The second was a discrete hidden Markov model (HMM) front end, featuring explicit state duration probability distributions. Two experiments were performed. The first used a speaker-dependent database with a lexicon of 571 words. Using a low-perplexity grammar, the NNM front end produced a word accuracy of 94% and a sentence accuracy of 86%. This was slightly inferior to the HMM front end, which produced accuracies of 96% and 88%. Without the grammar, word accuracies of 58% (NNM) and 49% (HMM) were recorded. The second experiment used the MIT portion of the TIMIT database (415 speakers, 2072 sentences in total). Results were poor for both front ends, with one producing marginally better results than the other.
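To make the NNM front end's second stage concrete, below is a minimal sketch (not the authors' code) of Viterbi segmentation over per-frame label scores: a hypothetical acoustic model (such as the radial basis function network mentioned in the abstract) is assumed to emit per-frame log-probabilities for each label, and the Viterbi algorithm recovers the most likely label sequence under transition constraints. All array shapes, function names, and the toy transition matrix are assumptions made for illustration.

```python
# Sketch of Viterbi segmentation over per-frame label scores (illustrative only).
import numpy as np

def viterbi_segment(frame_logprobs, log_trans, log_init):
    """frame_logprobs: (T, S) per-frame log P(label | frame), e.g. from an RBF network.
    log_trans: (S, S) log transition probabilities between labels.
    log_init: (S,) log initial label probabilities.
    Returns the most likely label index per frame (a frame-level segmentation)."""
    T, S = frame_logprobs.shape
    delta = np.full((T, S), -np.inf)       # best path score ending in label s at frame t
    backptr = np.zeros((T, S), dtype=int)  # best predecessor label, for traceback
    delta[0] = log_init + frame_logprobs[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans        # rows: previous label, cols: next label
        backptr[t] = np.argmax(scores, axis=0)
        delta[t] = scores[backptr[t], np.arange(S)] + frame_logprobs[t]
    # Trace back the best path from the final frame.
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 2, -1, -1):
        path[t] = backptr[t + 1, path[t + 1]]
    return path

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, S = 20, 3                                    # toy example: 20 frames, 3 labels
    frame_logprobs = np.log(rng.dirichlet(np.ones(S), size=T))
    log_trans = np.log(np.full((S, S), 0.1) + 0.7 * np.eye(S))  # favour staying in the same label
    log_init = np.log(np.full(S, 1.0 / S))
    print(viterbi_segment(frame_logprobs, log_trans, log_init))
```

The self-transition bias in the toy transition matrix encourages runs of identical labels, which is what turns frame-by-frame classifications into contiguous segments.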