Variational EM for binaural sound-source separation and localization

Authors: Antoine Deleforge, Florence Forbes, Radu Horaud

DOI: 10.1109/ICASSP.2013.6637612

Abstract: The sound-source separation and localization (SSL) problems are addressed within a unified formulation. Firstly, a mapping between white-noise source locations and binaural cues is estimated. Secondly, SSL is solved via Bayesian inversion of this mapping in the presence of multiple sparse-spectrum emitters (such as speech), noise and reverberations. We propose a variational EM algorithm, which is described in detail together with initialization and convergence issues. Extensive real-data experiments show that the method outperforms the state-of-the-art both in separation and localization (azimuth and elevation).
