Lightweight and optimized sound source localization and tracking methods for open and closed microphone array configurations

Authors: François Grondin, François Michaud

DOI: 10.1016/J.ROBOT.2019.01.002

Abstract: Human–robot interaction in natural settings requires filtering out the different sources of sounds from the environment. Such ability usually involves the use of microphone arrays to localize, track and separate sound sources online. Multi-microphone signal processing techniques can improve robustness to noise, but the processing cost increases with the number of microphones used, limiting response time and widespread use on different types of mobile robots. Since sound source localization methods are the most expensive in terms of computing resources, as they involve scanning a large 3D space, minimizing the amount of computations required would facilitate their implementation and use on robots. The robot's shape also brings constraints on the microphone array geometry and configurations. In addition, sound source localization methods usually return noisy features that need to be smoothed and filtered by tracking the sound sources. This paper presents a novel sound source localization method, called SRP-PHAT-HSDA, that scans space with coarse and fine resolution grids to reduce the number of memory lookups. A microphone directivity model is used to reduce the number of directions to scan and to ignore non-significant pairs of microphones. A configuration method is also introduced to automatically set parameters that are normally empirically tuned according to the shape of the microphone array. For sound source tracking, this paper presents a modified 3D Kalman (M3K) method capable of simultaneously tracking in 3D the directions of sound sources. Using a 16-microphone array and low-cost hardware, results show that SRP-PHAT-HSDA and M3K perform at least as well as other sound source localization and tracking methods while using up to 4 and 30 times fewer computing resources, respectively.
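
The coarse-then-fine scan mentioned in the abstract can be illustrated with a generic two-level search over directions. This is a minimal sketch under stated assumptions, not the authors' SRP-PHAT-HSDA implementation: the Fibonacci-lattice grids, the `score` function (a stand-in for a steered-response power value), and every parameter value are hypothetical and chosen only for illustration.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (assumed scan grid)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        points.append((r * math.cos(golden_angle * i),
                       r * math.sin(golden_angle * i),
                       z))
    return points

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def coarse_to_fine_search(score, n_coarse=100, n_fine=2000, max_angle_deg=25.0):
    """Score a coarse grid first, then only the fine-grid points that lie
    within max_angle_deg of the best coarse direction.

    Returns the best fine direction and the number of score evaluations.
    """
    coarse = fibonacci_sphere(n_coarse)
    fine = fibonacci_sphere(n_fine)

    # Level 1: exhaustive scan of the small coarse grid.
    best_coarse = max(coarse, key=score)

    # Level 2: restrict the fine grid to a cone around the coarse winner.
    cos_limit = math.cos(math.radians(max_angle_deg))
    candidates = [p for p in fine if dot(p, best_coarse) >= cos_limit]
    if not candidates:  # fall back to the coarse result if the cone is empty
        candidates = [best_coarse]
    best_fine = max(candidates, key=score)

    return best_fine, len(coarse) + len(candidates)

# Toy usage: the "source" direction and the smooth score are made up.
true_direction = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
toy_score = lambda p: dot(p, true_direction)
estimate, evaluations = coarse_to_fine_search(toy_score)
print(estimate, evaluations)
```

With a smooth, single-peaked score such as the toy one above, the two-level search evaluates far fewer directions than a full scan of the fine grid. A real steered-response power map can have several peaks, so the cone width and grid sizes would need to be chosen with that in mind.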
