Robust voice activity detection based on noise eigenspace

Authors: Dongwen Ying, Yu Shi, Xugang Lu, Jianwu Dang, Frank Soong

DOI: 10.1250/AST.28.413

Abstract: In this study, we propose a voice activity detector (VAD) based on noise eigenspace, which improves the robustness of VAD by utilizing the energy compression capability of the noise eigenspace. A noise eigenspace is constructed by using eigenvalue decomposition of the noise correlation matrix. When noisy speech is projected into the noise eigenspace, the noise energy is packed into a few dimensions with large eigenvalues, and those dimensions hopefully possess relatively less speech energy, because the noise distribution is usually different from the speech distribution. The noise energy can be reduced by discarding the dimensions with large noise energy, while no significant loss occurs in speech energy. To track noise variation, the noise eigenspace is periodically updated, where the computation cost for eigenspace construction is kept at an acceptable level. The proposed VAD was evaluated using the TIMIT database mixed with several noises. The experiment showed that the proposed VAD was more accurate than previous VADs in noisy environments.
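The projection idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`noise_eigenspace`, `residual_energy`, `detect_speech`), the number of discarded dimensions `n_discard`, and the fixed `threshold` decision rule are all assumptions made for illustration, and the periodic eigenspace update is left out.

```python
import numpy as np


def noise_eigenspace(noise_frames):
    """Eigendecompose the correlation matrix of noise-only frames.

    noise_frames has shape (n_frames, dim). Returns eigenvalues in
    descending order and the matching eigenvectors as columns.
    """
    corr = noise_frames.T @ noise_frames / len(noise_frames)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def residual_energy(frames, eigvecs, n_discard):
    """Per-frame energy after dropping the n_discard dimensions that
    carry the largest noise eigenvalues (hypothetical feature)."""
    projected = frames @ eigvecs      # coordinates in the noise eigenspace
    kept = projected[:, n_discard:]   # discard the noise-dominated dimensions
    return np.sum(kept ** 2, axis=1)


def detect_speech(frames, eigvecs, n_discard, threshold):
    """Assumed decision rule: a frame counts as speech when its residual
    energy exceeds a fixed threshold."""
    return residual_energy(frames, eigvecs, n_discard) > threshold
```

In this sketch the eigenspace would be re-estimated from recent noise-only frames whenever the noise is expected to have changed, which corresponds to the periodic update mentioned in the abstract; how often to do so, and how to choose `n_discard` and `threshold`, is not specified here.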
