Author: E. Tsiang
DOI: 10.1109/ICASSP.1998.675463
Keywords: Training set, Feature extraction, Context (language use), Generalization, Computer science, Representation (mathematics), Net (mathematics), Artificial intelligence, Wavelet transform, Pattern recognition, Invariant (mathematics), Set (abstract data type)
Abstract: The proposed neural architecture consists of an analytic lower net and a synthetic upper net. This paper focuses on the upper net. The lower net performs a 2D multiresolution wavelet decomposition of an initial spectral representation to yield a multichannel representation of local frequency modulations at multiple scales. From this representation, the upper net synthesizes increasingly complex features, resulting in a set of acoustic observables at the top layer with multiscale context dependence. The upper net also provides for invariance under shifts and dilatations in tone intervals and time intervals, by building these transformations into the architecture. Application to the recognition of gross and fine phonetic categories from continuous speech of diverse speakers shows that it achieves high accuracy and strong generalization from modest amounts of training data.
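
To make the idea of a 2D multiresolution wavelet decomposition of a spectral representation concrete, here is a minimal sketch in plain Python. It is not the paper's method: the abstract does not name the wavelet, so the Haar wavelet is assumed here for simplicity, and the function names `haar_step_2d` and `multires_decompose` are hypothetical. The input is assumed to be a small time-frequency grid with rows as frequency channels and columns as time frames.

```python
def haar_step_2d(x):
    """One level of a 2D Haar transform (assumed wavelet, not from the paper).

    x is a list of lists with even dimensions. Returns four grids, each half
    the size of x along both axes: local average, change across rows,
    change across columns, and diagonal change.
    """
    approx, d_row, d_col, d_diag = [], [], [], []
    for i in range(0, len(x), 2):
        a_r, r_r, c_r, g_r = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            a_r.append((a + b + c + d) / 4.0)  # coarse approximation
            r_r.append((a + b - c - d) / 4.0)  # variation between rows (frequency)
            c_r.append((a - b + c - d) / 4.0)  # variation between columns (time)
            g_r.append((a - b - c + d) / 4.0)  # joint (diagonal) variation
        approx.append(a_r)
        d_row.append(r_r)
        d_col.append(c_r)
        d_diag.append(g_r)
    return approx, d_row, d_col, d_diag


def multires_decompose(x, levels):
    """Apply haar_step_2d repeatedly to the approximation.

    Returns the coarsest approximation and a list of detail triples,
    ordered from the finest scale to the coarsest.
    """
    details = []
    approx = x
    for _ in range(levels):
        approx, d_row, d_col, d_diag = haar_step_2d(approx)
        details.append((d_row, d_col, d_diag))
    return approx, details
```

Each entry of `details` holds the local variations at one scale, which loosely corresponds to the multichannel, multiscale representation the abstract describes. A real system would more likely use a smoother wavelet and an array library.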