A robust algorithm for accurate endpointing of speech signals

Author: M.H. Savoji

DOI: 10.1016/0167-6393(89)90067-8

Keywords: Noise; Speech recognition; Context (language use); Computer science; A priori and a posteriori; Zero-crossing rate; Voice activity detection; Energy (signal processing); Background noise; Pattern recognition; Algorithm; Artificial intelligence; Feature (computer vision)

Abstract: A robust new algorithm for accurate endpointing of speech signals is described in this paper after an overview of the literature. This algorithm uses simple measures based on energy and zero-crossing rate for speech/silence detection. Instead of the usual two-state model, three states including a transitory phase are assumed. The measure is used in a special manner in the transitory state to improve accuracy. The classification uses context and some knowledge-based heuristics for the correction of false detections. The approach here is one of visual detection of a waveform embedded in noise. No a priori knowledge of the noise is needed, and the algorithm is capable of producing good results even in cases where the signal starts with mouth noise. One important feature is its ease of implementation for real-time processing. It is also adaptive and can cope with varying signal-to-background-noise ratios. It was originally developed for a database collected in an acoustic chamber. Modifications for application to telephony, as well as preliminary test results, are included.
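The ingredients named in the abstract (short-time energy, zero-crossing rate, and a three-state model with a transitory phase) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names, the fixed thresholds `low`, `high`, and `zcr_thresh`, the rule that promotes a transitory frame to speech on a high zero-crossing rate, and the synthetic test signal are all assumptions made for illustration. The adaptive behaviour and the knowledge-based heuristics mentioned in the abstract are not modelled.

```python
import math

def frame_features(samples, frame_len):
    """Return (mean energy, zero-crossing rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # A crossing is a sign change between neighbouring samples.
        crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def classify_frame(energy, zcr, low, high, zcr_thresh):
    """Three-state decision; all thresholds are hypothetical fixed values."""
    if energy >= high:
        return "speech"
    if energy >= low:
        # Assumed rule: mid-energy frames with many crossings count as speech.
        return "speech" if zcr >= zcr_thresh else "transition"
    return "silence"

def find_endpoints(labels):
    """Span from first to last speech frame, widened over adjacent transition frames."""
    speech = [i for i, lab in enumerate(labels) if lab == "speech"]
    if not speech:
        return None
    start, end = speech[0], speech[-1]
    while start > 0 and labels[start - 1] == "transition":
        start -= 1
    while end < len(labels) - 1 and labels[end + 1] == "transition":
        end += 1
    return start, end

def make_demo_signal(frame_len=100):
    """Synthetic signal: silence, a weak onset, a loud tone, a weak offset, silence."""
    amps = [0.0] * 3 + [0.3] + [1.0] * 4 + [0.3] + [0.0] * 3
    return [a * math.sin(2 * math.pi * n / 20) for a in amps for n in range(frame_len)]

def demo_endpoints():
    feats = frame_features(make_demo_signal(), 100)
    labels = [classify_frame(e, z, low=0.01, high=0.2, zcr_thresh=0.3) for e, z in feats]
    return labels, find_endpoints(labels)
```

Calling `demo_endpoints()` returns the per-frame labels and the inclusive frame indices of the detected utterance. The weak onset and offset frames fall into the transitory state and are included in the detected span, which is the practical benefit of a third state over a plain speech/silence split.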

References (14)
V. Sarma, D. Venugopal, Studies on pattern recognition approach to voiced-unvoiced-silence classification. ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1-4 (1978). DOI: 10.1109/ICASSP.1978.1170438
Chieh Tsao, R. Gray, An endpoint detector for LPC speech using residual error look-ahead for vector quantization applications. International Conference on Acoustics, Speech, and Signal Processing, vol. 9, pp. 97-100 (1984). DOI: 10.1109/ICASSP.1984.1172658
J. G. Wilpon, L. R. Rabiner, T. Martin, An Improved Word-Detection Algorithm for Telephone-Quality Speech Incorporating Both Syntactic and Semantic Constraints. AT&T Bell Laboratories Technical Journal, vol. 63, pp. 479-498 (1984). DOI: 10.1002/J.1538-7305.1984.TB00016.X
L. R. Rabiner, M. R. Sambur, An Algorithm for Determining the Endpoints of Isolated Utterances. Bell System Technical Journal, vol. 54, pp. 297-315 (1975). DOI: 10.1002/J.1538-7305.1975.TB02840.X
L. Lamel, L. Rabiner, A. Rosenberg, J. Wilpon, An improved endpoint detector for isolated word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 29, pp. 777-785 (1981). DOI: 10.1109/TASSP.1981.1163642
V. Ramamoorthy, Voice/Unvoice detection based on a composite-Gaussian source model of speech. International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 57-60 (1980). DOI: 10.1109/ICASSP.1980.1171014
M. DonVito, B. Schoenherr, Subband coding with silence detection. International Conference on Acoustics, Speech, and Signal Processing, vol. 10, pp. 1433-1436 (1985). DOI: 10.1109/ICASSP.1985.1168092
P. De Souza, A statistical approach to the design of an adaptive self-normalizing silence detector. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 31, pp. 678-684 (1983). DOI: 10.1109/TASSP.1983.1164129
B. Atal, L. Rabiner, A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, pp. 201-212 (1976). DOI: 10.1109/TASSP.1976.1162800
J. Lynch, J. Josenhans, R. Crochiere, Speech/Silence segmentation for real-time coding via rule based adaptive endpoint detection. International Conference on Acoustics, Speech, and Signal Processing, vol. 12, pp. 1348-1351 (1987). DOI: 10.1109/ICASSP.1987.1169516