Unified auditory functions based on Bayesian topic model

Authors: Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi G. Okuno

DOI: 10.1109/IROS.2012.6385787

Abstract: Existing auditory functions for robots, such as sound source localization and separation, have been implemented in a cascaded framework whose overall performance may be degraded by any failure in its subsystems. These approaches often require careful, environment-dependent tuning of each subsystem to achieve better performance. This paper presents a unified framework where the whole system is integrated as a Bayesian topic model. The method improves both localization and separation with a common configuration under various environments by iterative inference using Gibbs sampling. Experimental results from three environments of different reverberation times confirm that our method outperforms state-of-the-art methods, especially in reverberant environments, and shows performance comparable to that of an existing robot audition system.

References (24)
David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, vol. 3, pp. 993-1022 (2003). DOI: 10.5555/944919.944937
Yoko Sasaki, Satoshi Kagami, Hiroshi Mizoguchi. Online Short-Term Multiple Sound Source Mapping for a Mobile Robot by Robust Motion Triangulation. Advanced Robotics, vol. 23, pp. 145-164 (2009). DOI: 10.1163/156855308X392717
Intae Lee, Taesu Kim, Te-Won Lee. Fast fixed-point independent vector analysis algorithms for convolutive blind source separation. Signal Processing, vol. 87, pp. 1859-1871 (2007). DOI: 10.1016/J.SIGPRO.2007.01.010
T. L. Griffiths, M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, vol. 101, pp. 5228-5235 (2004). DOI: 10.1073/PNAS.0307752101
Hiroshi Sawada, Shoko Araki, Shoji Makino. Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment. IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, pp. 516-527 (2011). DOI: 10.1109/TASL.2010.2051355
Kazuhiro Nakadai, Toru Takahashi, Hiroshi G. Okuno, Hirofumi Nakajima, Yuji Hasegawa, Hiroshi Tsujino. Design and Implementation of Robot Audition System 'HARK' — Open Source Software for Listening to Three Simultaneous Speakers. Advanced Robotics, vol. 24, pp. 739-761 (2010). DOI: 10.1163/016918610X493561
Nobutaka Ono. Stable and fast update rules for independent vector analysis based on auxiliary function technique. Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 189-192 (2011). DOI: 10.1109/ASPAA.2011.6082320
Michael D. Escobar, Mike West. Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association, vol. 90, pp. 577-588 (1995). DOI: 10.1080/01621459.1995.10476550
H. Sawada, R. Mukai, S. Araki, S. Makino. A robust and precise method for solving the permutation problem of frequency-domain blind source separation. IEEE Transactions on Speech and Audio Processing, vol. 12, pp. 530-538 (2004). DOI: 10.1109/TSA.2004.832994
Tony Jebara, Daniel P. Ellis, Michael I. Mandel. An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments. Neural Information Processing Systems, vol. 19, pp. 953-960 (2006). DOI: 10.7916/D84176FK