Authors: A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, K.J. Lang
DOI: 10.1109/29.21701
Keywords:
Abstract: The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: (1) using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation; and (2) the time-delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independently of position in time and therefore not blurred by temporal shifts in the input. As a recognition task, the speaker-dependent recognition of the phonemes B, D, and G in varying phonetic contexts was chosen. For comparison, several discrete hidden Markov models (HMM) were trained to perform the same task. Performance evaluation over 1946 testing tokens from three speakers showed that the TDNN achieves a recognition rate of 98.5% correct, while the rate obtained by the best of the HMMs was only 93.7%.
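
As a rough illustration of property (2), the sketch below applies one shared set of weights to every window of consecutive input frames and then sums each unit's activation over time. It is not the authors' network: the layer sizes, the random untrained weights, the zero-padded toy input, and the helper names `time_delay_layer`, `integrate_over_time`, and `embed` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def time_delay_layer(frames, weights, bias):
    """Apply the same weights to every window of len(weights) consecutive frames.

    frames:  list of T feature vectors (each a list of F floats)
    weights: weights[d][u][f] = weight from feature f at delay d to unit u
    bias:    list of per-unit biases
    Returns T - D + 1 activation vectors, one per window position.
    """
    n_delays = len(weights)
    n_units = len(bias)
    outputs = []
    for t in range(len(frames) - n_delays + 1):
        acts = []
        for u in range(n_units):
            total = bias[u]
            for d in range(n_delays):
                for f, x in enumerate(frames[t + d]):
                    total += weights[d][u][f] * x
            acts.append(sigmoid(total))
        outputs.append(acts)
    return outputs


def integrate_over_time(activations):
    """Sum each unit's activation over all window positions."""
    return [sum(column) for column in zip(*activations)]


def embed(pattern, offset, total_frames):
    """Place a pattern at the given frame offset inside an all-zero input."""
    n_features = len(pattern[0])
    frames = [[0.0] * n_features for _ in range(total_frames)]
    for i, frame in enumerate(pattern):
        frames[offset + i] = list(frame)
    return frames


# Toy sizes and random weights, chosen only for illustration (assumptions).
rng = random.Random(0)
N_FEATURES, N_DELAYS, N_UNITS, N_FRAMES = 4, 3, 2, 20
weights = [[[rng.uniform(-1, 1) for _ in range(N_FEATURES)]
            for _ in range(N_UNITS)] for _ in range(N_DELAYS)]
bias = [rng.uniform(-1, 1) for _ in range(N_UNITS)]
pattern = [[rng.uniform(0, 1) for _ in range(N_FEATURES)] for _ in range(5)]

# The same pattern placed early and late in the input.
early = integrate_over_time(
    time_delay_layer(embed(pattern, 3, N_FRAMES), weights, bias))
late = integrate_over_time(
    time_delay_layer(embed(pattern, 9, N_FRAMES), weights, bias))
print(early)
print(late)
```

Because every window position uses the same weights, moving the pattern only moves where the units respond, so the time-integrated activations in `early` and `late` agree up to floating-point rounding. This holds here because both offsets keep the pattern at least `N_DELAYS - 1` frames away from either end of the input.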