Detection of human speech in structured noise

Authors: J.D. Hoyt, H. Wechsler

DOI: 10.1109/ICASSP.1994.389676

Keywords: Feature vector, Noise shaping, Word error rate, Speech processing, Speech enhancement, Artificial intelligence, Computer science, Speech recognition, Noise, Acoustic testing, Pattern recognition, Formant

Abstract: This paper describes research to develop an efficient system that provides a binary decision as to the presence of speech in a short (one to three second) time sample of an acoustic signal. A method that efficiently and reliably detects human speech in structured noise (such as wind, music, traffic sounds, etc.) is described. Two separate algorithms were developed. The first algorithm detects speech by testing for concave and/or convex formant shapes. The second is a statistical pattern classifier utilizing radial basis function (RBF) networks with mel-cepstra feature vectors. Classification errors are not consistent across these two different methods. As a consequence, we plan to reduce our error rate through fusion ...
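
The second algorithm in the abstract, an RBF network over feature vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the 4-dimensional synthetic vectors stand in for mel-cepstra features, the centers are simply the class means, and the function names (`rbf_features`, `fit_rbf`, `predict_rbf`), the Gaussian width, and the 0.5 decision threshold are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_features(X, centers, width):
    """Gaussian RBF activations of each sample with respect to each center."""
    # Squared Euclidean distance between every sample and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def _design_matrix(X, centers, width):
    # Hidden-layer activations plus a constant bias column.
    phi = rbf_features(X, centers, width)
    return np.hstack([phi, np.ones((phi.shape[0], 1))])

def fit_rbf(X, y, centers, width):
    """Fit the linear output weights by least squares (hidden layer is fixed)."""
    w, *_ = np.linalg.lstsq(_design_matrix(X, centers, width), y, rcond=None)
    return w

def predict_rbf(X, centers, width, w):
    """Binary decision: 1 = speech present, 0 = no speech (assumed threshold 0.5)."""
    return (_design_matrix(X, centers, width) @ w > 0.5).astype(int)

# Synthetic stand-ins for feature vectors (assumption: real inputs would be
# mel-cepstra computed from one- to three-second acoustic samples).
rng = np.random.default_rng(0)
speech = rng.normal(loc=2.0, scale=0.3, size=(50, 4))
noise = rng.normal(loc=-2.0, scale=0.3, size=(50, 4))
X = np.vstack([speech, noise])
y = np.concatenate([np.ones(50), np.zeros(50)])

# One center per class, placed at the class mean (assumption: a real system
# would likely use more centers, e.g. chosen by clustering).
centers = np.vstack([speech.mean(axis=0), noise.mean(axis=0)])
width = 1.0

w = fit_rbf(X, y, centers, width)
pred = predict_rbf(X, centers, width, w)
print("training accuracy:", (pred == y).mean())
```

Keeping the hidden layer fixed and solving only for the output weights reduces training to a single linear least-squares problem, which is one common way to train RBF networks; the abstract does not say which training procedure the authors used.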

References (8)
John G. Proakis, John R. Deller, John H. Hansen, Discrete-Time Processing of Speech Signals (1993)
Alan V. Oppenheim, Ronald W. Schafer, Discrete-Time Signal Processing (1989)
J. Yang, Frequency domain noise suppression approaches in mobile telephone systems, IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 363-366 (1993), DOI: 10.1109/ICASSP.1993.319313
Lawrence R. Rabiner, Ronald W. Schafer, Digital Processing of Speech Signals (1978)
D.R. Hush, B.G. Horne, Progress in supervised neural networks, IEEE Signal Processing Magazine, vol. 10, pp. 8-39 (1993), DOI: 10.1109/79.180705
Kenney Ng, Richard P. Lippmann, A Comparative Study of the Practical Characteristics of Neural Network and Conventional Pattern Classifiers, Neural Information Processing Systems, vol. 3, pp. 970-976 (1990)
Wai-Yip Chan, David D. Falconer, Speech Detection for a Voice/Data Mobile Radio Terminal (1983)
Brian C. J. Moore, An Introduction to the Psychology of Hearing, 3rd ed., San Diego, CA, US: Academic Press (1989)