Author: M.H. Savoji
DOI: 10.1016/0167-6393(89)90067-8
Keywords: Noise, Speech recognition, Context (language use), Computer science, A priori and a posteriori, Zero-crossing rate, Voice activity detection, Energy (signal processing), Background noise, Pattern recognition, Algorithm, Artificial intelligence, Feature (computer vision)
Abstract: A robust new algorithm for accurate endpointing of speech signals is described in this paper after an overview of the literature. This algorithm uses simple measures based on energy and zero-crossing rate for speech/silence detection. Instead of the usual two-state model, three states including a transitory phase are assumed. The energy measure is used in a special manner in the transitory state to improve accuracy. The classification is done in context, and some knowledge-based heuristics are used for the correction of false detections. The approach used here is that of visual detection of a waveform embedded in noise. No a priori knowledge of the noise is needed, and the algorithm is capable of producing good results even in cases where the signal starts with mouth noise. One important feature of the algorithm is its ease of implementation for real-time processing. It is also adaptive and can cope with varying background noise and signal-to-noise ratios. The algorithm was originally developed for a database collected in an acoustic chamber. Modifications for application to telephony, as well as preliminary test results, are included.
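The abstract names the ingredients (short-time energy, zero-crossing rate, and a three-state model with a transitory phase) but gives no thresholds or decision rules. The following is only a minimal sketch of that general idea, not the paper's algorithm. The function names, the frame length, the threshold factors, and the use of a few leading frames to estimate the noise level are all illustrative assumptions, and the context-based heuristics for correcting false detections are omitted.

```python
def frame_features(samples, frame_len):
    """Return a list of (energy, zero_crossing_rate) per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def label_frames(feats, noise_frames=5, low_factor=3.0, high_factor=10.0,
                 zcr_margin=0.15):
    """Label each frame 'silence', 'transition', or 'speech'.

    Assumption (not from the paper): the first `noise_frames` frames contain
    only background noise and are used to set the thresholds.
    """
    lead = feats[:noise_frames]
    # Floor the noise energy so an all-zero lead-in does not zero the thresholds.
    noise_energy = max(sum(e for e, _ in lead) / len(lead), 1e-12)
    noise_zcr = sum(z for _, z in lead) / len(lead)
    low = low_factor * noise_energy
    high = high_factor * noise_energy

    labels = []
    for energy, zcr in feats:
        if energy >= high:
            labels.append("speech")
        elif energy >= low or abs(zcr - noise_zcr) > zcr_margin:
            # Weak energy rise or unusual zero-crossing rate: transitory state.
            labels.append("transition")
        else:
            labels.append("silence")
    return labels


def find_endpoints(labels):
    """Return (first, last) frame indices of the utterance, or None.

    The span from the first to the last 'speech' frame is widened to include
    any 'transition' frames directly adjacent to it.
    """
    speech = [i for i, label in enumerate(labels) if label == "speech"]
    if not speech:
        return None
    start, end = speech[0], speech[-1]
    while start > 0 and labels[start - 1] == "transition":
        start -= 1
    while end < len(labels) - 1 and labels[end + 1] == "transition":
        end += 1
    return start, end
```

A caller would compute `frame_features(samples, frame_len)`, pass the result to `label_frames`, and then call `find_endpoints` on the labels. Treating the transitory state as a buffer around confident speech frames is one simple way to keep weak onsets and offsets inside the detected utterance. The paper's own use of energy in that state may differ.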