Authors: Ángel de la Torre, Javier Ramírez, M. Carmen Benítez, Antonio J. Rubio, José C. Segura
DOI:
Keywords:
Abstract: This paper presents an efficient voice activity detector (VAD) based on estimating the long-term spectral divergence (LTSD) between noise and speech periods. The proposed method decomposes the input signal into overlapped frames, uses a sliding window to compute the spectral envelope, and measures the speech/non-speech LTSD, thus yielding a highly discriminating decision rule and minimizing the average number of decision errors. In order to increase non-speech detection accuracy, the decision threshold is adapted to the measured noise energy, while a controlled hang-over is activated only when the observed signal-to-noise ratio (SNR) is low. An exhaustive analysis of the proposed VAD is carried out using the AURORA TIdigits and SpeechDat-Car (SDC) databases. The proposed VAD is compared to the most commonly used ones in the field in terms of recognition performance. Experimental results demonstrate a sustained advantage over the G.729, AMR, and AFE VADs.
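
The pipeline the abstract outlines (overlapped frames, a sliding-window spectral envelope, an LTSD score, and a noise-adapted threshold) can be illustrated with a minimal Python sketch. It rests on several assumptions the abstract does not state: the envelope is taken as the per-bin maximum over a window of `2 * order + 1` frames, the LTSD as the mean envelope-to-noise power ratio in dB, and the threshold as a linear interpolation between two hypothetical operating points. All function names and parameter values are illustrative, and the hang-over scheme is not modeled.

```python
import cmath
import math


def split_frames(signal, size, hop):
    """Cut a signal into overlapping frames (hop < size gives overlap)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT magnitude of one frame, bins 0..len(frame)//2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_envelope(spectra, l, order):
    """Assumed envelope: per-bin maximum over frames l-order..l+order."""
    lo, hi = max(0, l - order), min(len(spectra), l + order + 1)
    return [max(s[k] for s in spectra[lo:hi]) for k in range(len(spectra[0]))]


def ltsd(spectra, noise, l, order, eps=1e-12):
    """Assumed LTSD: mean envelope-to-noise power ratio for frame l, in dB."""
    env = spectral_envelope(spectra, l, order)
    ratio = sum(e * e / (n * n + eps) for e, n in zip(env, noise)) / len(env)
    return 10.0 * math.log10(ratio + eps)


def adaptive_threshold(noise_energy_db, e_low, e_high, thr_quiet, thr_noisy):
    """Hypothetical rule: interpolate the threshold between two noise levels."""
    if noise_energy_db <= e_low:
        return thr_quiet
    if noise_energy_db >= e_high:
        return thr_noisy
    frac = (noise_energy_db - e_low) / (e_high - e_low)
    return thr_quiet + frac * (thr_noisy - thr_quiet)


def vad(spectra, noise, order, threshold_db):
    """Flag frame l as speech when its LTSD exceeds the threshold."""
    return [ltsd(spectra, noise, l, order) > threshold_db
            for l in range(len(spectra))]


if __name__ == "__main__":
    # Toy magnitude spectra: flat noise with one louder frame at index 2.
    noise = [1.0] * 4
    spectra = [[1.0] * 4] * 2 + [[10.0] * 4] + [[1.0] * 4] * 3
    thr = adaptive_threshold(40.0, 30.0, 50.0, 6.0, 2.5)  # made-up values
    print(vad(spectra, noise, order=1, threshold_db=thr))
```

Because the envelope takes a maximum over neighbouring frames, a single loud frame also raises the score of the frames on either side of it. That is the intended smoothing effect of a long-term measure, and `order` controls how far it reaches.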