Authors: Jean Rouat, Stephane Loiselle, Stephane Molotchnikoff
DOI: 10.1109/IROS.2011.6094672
Keywords: Speaker recognition, Word recognition, Pattern recognition, Voice activity detection, Speech processing, Signal-to-noise ratio, Variable frame rate, Feature extraction, Speech recognition, Mel-frequency cepstrum, Hidden Markov model, Artificial intelligence, Computer science
Abstract: A new bio-inspired speech analysis system that extracts acoustical events is proposed and used in the design of a variable frame rate (VFR) recognizer. The same recognizer (Hidden Markov Model, HMM, with Mel Frequency Cepstrum Coefficients, MFCC) has been used with the VFR and the conventional fixed frame rate (FFR) approach. In comparison with other recognizers, the hierarchical features have the potential to serve as classification parameters in a complete recognition system. Also, no voice activity detection is required, and there are no hard decisions to be taken by the system. Events label and identify the moments at which the speech properties are stable or changing. These markers indicate where an analysis window can be positioned to perform recognition. Inspired by our knowledge of the auditory and visual systems, complex features like transients in energy and orientation are used. Training is done on clean, noisy (from 20 dB to −10 dB Signal to Noise Ratio, SNR) and reverberated speech using the TI 46-word database corrupted with 4 noises from the Aurora 2 data. In comparison with the FFR recognizer, the VFR recognizer yields more than 50% increase in recognition rates for the speaker independent isolated word recognition task when SNRs are between 0 and 20 dB.