Automatic adaptive speech separation using beamformer-output-ratio for voice activity classification

Authors: Thuy Ngoc Tran, William Cowley, André Pollok

DOI: 10.1016/J.SIGPRO.2015.01.015

Abstract: This paper focuses on the practical challenge of adaptation control for speech separation systems. Adaptive beamforming methods, such as minimum variance distortionless response (MVDR), can effectively extract the desired signal from interference and noise. However, to avoid the signal cancellation problem, beamformer adaptation is halted when the desired speaker is active. An automated scheme for this requires classifying the speakers' voice activity status, which remains a challenge in multi-speaker environments. In this paper, we propose a novel approach to identify the voice activities of two speakers based on a new metric, called the beamformer-output-ratio (BOR). Statistical properties of the BOR are studied and used to develop a hypothesis-based method for voice activity classification (VAC). The method is further refined using an algorithm for detecting incorrect classification by analysing changes in the output power of a blind adapting MVDR beamformer. Based on this classification, we construct an automatic adaptive system to simultaneously separate the two speakers. The separation module uses adaptive beamformers whose adaptation is guided by the voice activity classification. Our methods lead to, in some cases, a 20% reduction in classification error and an 8 dB improvement in SINR. The results are verified on both synthesised signals and realistic recordings.

Highlights: We design an automatic adaptive speech separation system for two speakers. The BOR quantity and its roles in active speaker identification are introduced. The BOR-VAC method is developed, in a generic form and a realisation. We model the beamformer behaviour to detect incorrect adaptation. The proposed systems are tested on real ...
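As background for the MVDR beamformer named in the abstract, the following is a minimal sketch of the textbook MVDR weight formula, w = R^{-1} d / (d^H R^{-1} d). It is not the paper's BOR or VAC method. The uniform linear array, the half-wavelength spacing, the arrival angles, and the power levels are all hypothetical values chosen for illustration, and the function names are invented here.

```python
import numpy as np


def steering_vector(num_mics, spacing_over_wavelength, angle_rad):
    """Narrowband steering vector of a uniform linear array (assumed geometry)."""
    n = np.arange(num_mics)
    phase = 2.0 * np.pi * spacing_over_wavelength * n * np.sin(angle_rad)
    return np.exp(-1j * phase)


def mvdr_weights(R, d):
    """Textbook MVDR weights: w = R^{-1} d / (d^H R^{-1} d)."""
    Rinv_d = np.linalg.solve(R, d)       # R^{-1} d without forming the inverse
    return Rinv_d / (d.conj() @ Rinv_d)  # normalise for unit gain towards d


# Hypothetical scenario: 4 microphones, target at 0 degrees, interferer at 40 degrees.
num_mics = 4
d_target = steering_vector(num_mics, 0.5, 0.0)
d_interf = steering_vector(num_mics, 0.5, np.deg2rad(40.0))

# Covariance estimated while the target is silent: interferer plus white noise.
# Adapting only in such periods is what avoids cancelling the target.
R = 10.0 * np.outer(d_interf, d_interf.conj()) + 0.1 * np.eye(num_mics)

w = mvdr_weights(R, d_target)
target_gain = abs(w.conj() @ d_target)  # distortionless constraint: equals 1
interf_gain = abs(w.conj() @ d_interf)  # strongly attenuated
print(f"target gain {target_gain:.3f}, interferer gain {interf_gain:.5f}")
```

In this sketch the covariance matrix contains no target component, which corresponds to adapting only while the desired speaker is inactive. Deciding when that condition holds is the voice activity classification problem the abstract describes.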

