Fuzzy Logic Speech/Non-speech Discrimination for Noise Robust Speech Processing

Authors: Rafael Culebras, Javier Ramírez, Juan Manuel Górriz, José C. Segura

DOI: 10.1007/11758501_55

Abstract: This paper presents a fuzzy logic speech/non-speech discrimination method for improving the performance of speech processing systems working in noisy environments. The fuzzy system is based on a Sugeno inference engine with membership functions defined as a combination of two Gaussian functions. The rule base consists of ten fuzzy if-then statements defined in terms of the denoised subband signal-to-noise ratios (SNRs) and the zero crossing rates (ZCRs). Its operation is optimized by means of a hybrid training algorithm that combines the least-squares method and the backpropagation gradient descent method for training the membership function parameters. Experiments conducted on the Spanish SpeechDat-Car database show that the proposed method yields clear improvements over a set of standardized VADs for discontinuous transmission (DTX) and distributed speech recognition (DSR), and also over recently published VAD methods.
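The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the zero-order (constant-consequent) Sugeno form, the product AND operator, the two-rule base, and every numeric parameter below are assumptions chosen for illustration, whereas the paper reports ten rules over denoised subband SNRs and ZCRs with trained parameters.

```python
import math

def gauss2mf(x, s1, c1, s2, c2):
    """Membership built from two Gaussians: a left shoulder (s1, c1) and a
    right shoulder (s2, c2), flat at 1.0 between c1 and c2 (assumes c1 <= c2)."""
    left = 1.0 if x >= c1 else math.exp(-((x - c1) ** 2) / (2 * s1 ** 2))
    right = 1.0 if x <= c2 else math.exp(-((x - c2) ** 2) / (2 * s2 ** 2))
    return left * right

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def sugeno_output(inputs, rules):
    """Zero-order Sugeno inference: each rule's firing strength is the product
    of its input memberships; the output is the strength-weighted average of
    the rules' constant consequents."""
    num = den = 0.0
    for mf_params, consequent in rules:
        w = 1.0
        for x, params in zip(inputs, mf_params):
            w *= gauss2mf(x, *params)
        num += w * consequent
        den += w
    return num / den if den > 0 else 0.0

# Hypothetical rule base over (SNR in dB, ZCR); all parameters are made up.
RULES = [
    # IF SNR is high AND ZCR is low THEN speech (1.0)
    ([(5.0, 15.0, 5.0, 40.0), (0.1, 0.0, 0.1, 0.2)], 1.0),
    # IF SNR is low AND ZCR is high THEN non-speech (0.0)
    ([(5.0, -20.0, 5.0, 0.0), (0.1, 0.3, 0.1, 1.0)], 0.0),
]

speech_score = sugeno_output([20.0, 0.1], RULES)   # close to 1
noise_score = sugeno_output([-10.0, 0.5], RULES)   # close to 0
```

A frame would be labelled speech when the score exceeds a threshold such as 0.5. In the paper the membership and consequent parameters are not hand-set as above but tuned by the hybrid least-squares and gradient descent training described in the abstract.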

References (22)
Ángel de la Torre, Javier Ramírez, M. Carmen Benítez, Antonio J. Rubio, José C. Segura, A New Adaptive Long-Term Spectral Estimation Voice Activity Detector. Conference of the International Speech Communication Association (2003)
Piergiorgio Svaizer, Maurizio Omologo, Luca Armani, Marco Matassoni, Use of a CSP-based voice activity detector for distant-talking ASR. Conference of the International Speech Communication Association (2003)
Qi Li, Jinsong Zheng, A. Tsai, Qiru Zhou, Robust endpoint detection and energy normalization for real-time speech and speaker recognition. IEEE Transactions on Speech and Audio Processing, vol. 10, pp. 146-157 (2002), 10.1109/TSA.2002.1001979
Régine Le Bouquin-Jeannès, Gérard Faucon, Study of a voice activity detector and its influence on a noise reduction system. Speech Communication, vol. 16, pp. 245-254 (1995), 10.1016/0167-6393(94)00056-G
Lamia Karray, Arnaud Martin, Towards improving speech detection robustness for speech recognition in adverse conditions. Speech Communication, vol. 40, pp. 261-276 (2003), 10.1016/S0167-6393(02)00066-3
J.-S.R. Jang, ANFIS: adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man, and Cybernetics, vol. 23, pp. 665-685 (1993), 10.1109/21.256541
Kyoung-Ho Woo, Tae-Young Yang, Kun-Jung Park, Chungyong Lee, Robust voice activity detection algorithm for estimating noise spectrum. Electronics Letters, vol. 36, pp. 180-181 (2000), 10.1049/EL:20000192
J.M. Mendel, Fuzzy logic systems for engineering: a tutorial. Proceedings of the IEEE, vol. 83, pp. 345-377 (1995), 10.1109/5.364485
J.M. Górriz, J. Ramírez, J.C. Segura, C.G. Puntonet, Improved MO-LRT VAD based on bispectra Gaussian model. Electronics Letters, vol. 41, pp. 877-879 (2005), 10.1049/EL:20051761
J. Ramírez, J.C. Segura, C. Benítez, L. García, A. Rubio, Statistical voice activity detection using a multiple observation likelihood ratio test. IEEE Signal Processing Letters, vol. 12, pp. 689-692 (2005), 10.1109/LSP.2005.855551