Authors: Malur K. Sundareshan, Pablo Zegers
DOI:
Keywords: Acoustic model, Neural gas, Feature vector, Time delay neural network, Artificial intelligence, Recurrent neural network, Computer science, Speaker recognition, Cepstrum, Speech recognition, Feature (machine learning), Pattern recognition
Abstract: Although speech recognition products are already available in the market at present, their development is mainly based on statistical techniques which work under very specific assumptions. The work presented in this thesis investigates the feasibility of alternative approaches for solving the problem more efficiently. A speech recognizer system comprised of two distinct blocks, a Feature Extractor and a Recognizer, is presented. The Feature Extractor block uses a standard LPC Cepstrum coder, which translates the incoming speech into a trajectory in feature space, followed by a Self Organizing Map, which tailors the outcome of the coder in order to produce optimal trajectory representations of words in reduced dimension feature spaces. Designs of the Recognizer block based on three different approaches are compared. The performance of recognizers based on Templates, MultiLayer Perceptrons, and Recurrent Neural Networks is tested on a small isolated-word, speaker-dependent recognition problem. Experimental results indicate that trajectories in such reduced dimension spaces can provide reliable representations of spoken words, while reducing the training complexity and the operation of the Recognizer. The comparison between the designs of the Recognizers conducted here gives a better understanding of the problem and its possible solutions. A new learning procedure that optimizes the usage of the training set is also presented. Optimal tailoring of trajectories,