A Robotic Auditory System that Interacts with Musical Sounds and Human Voices

Authors: Hideyuki Sawada, Toshiya Takechi

DOI: 10.20965/JACIII.2007.P1177

Keywords: Computer science, Musical, Sound localization, Microphone array, Auditory system, Speech recognition

Abstract: Voice and sound are the primary media of human communication. Humans can exchange information smoothly by voice under difficult conditions, such as in a noisy environment or in the presence of multiple speakers. Although we are surrounded by various sounds, we are able to detect the location of a sound source in 3D space, extract a particular sound from a mixture of sounds, and recognize the source of a specific sound. Music, likewise, is composed of various sounds generated by musical instruments, and it directly affects our emotions and feelings. This paper introduces real-time detection and identification of a particular sound among multiple sound sources using a microphone array, based on the location of the speaker and the tonal characteristics of the sound. The technique is also applied to an adaptive auditory system for a robotic arm that interacts with humans.
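The abstract does not say how the source location is estimated. As an illustration only, the sketch below shows one common way a two-microphone pair can yield a direction: find the inter-microphone time delay from the peak of a plain time-domain cross-correlation, then convert it to an angle under a far-field assumption. This is not the authors' method and omits the spectral weighting used in generalized correlation approaches; the function names, the speed of sound, the sample rate, and the microphone spacing are all assumptions chosen for the example.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means `other` receives the sound later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(ref)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += ref[i] * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_of_arrival(lag, sample_rate, mic_distance):
    """Convert a sample lag to an arrival angle in degrees (far-field assumption)."""
    delay = lag / sample_rate
    # Clamp to the valid domain of asin to guard against estimation noise.
    ratio = max(-1.0, min(1.0, delay * SPEED_OF_SOUND / mic_distance))
    return math.degrees(math.asin(ratio))


# Synthetic check: white noise that reaches the second microphone 5 samples later.
rng = random.Random(0)
source = [rng.uniform(-1.0, 1.0) for _ in range(400)]
true_lag = 5
mic_a = source
mic_b = [0.0] * true_lag + source[:-true_lag]

lag = estimate_delay(mic_a, mic_b, max_lag=20)
angle = direction_of_arrival(lag, sample_rate=16000, mic_distance=0.2)
```

With real recordings, the correlation peak is much less sharp than for synthetic white noise, which is one reason weighted variants of cross-correlation are often preferred in practice.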

References (9)
T. Nishiura, T. Yamada, S. Nakamura, K. Shikano, "Localization of multiple sound sources based on a CSP analysis with a microphone array," International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1053-1056 (2000), 10.1109/ICASSP.2000.859144
Masashi Unoki, Masato Akagi, "A method for signal extraction from noise-added signals," Electronics and Communications in Japan (Part III: Fundamental Electronic Science), vol. 80, pp. 1-11 (1997), 10.1002/(SICI)1520-6440(199711)80:11<1::AID-ECJC1>3.0.CO;2-8
Hideyuki Sawada, Minoru Ohkado, "Identification and tracking of particular speaker in noisy environment," Proceedings of SPIE - The International Society for Optical Engineering, vol. 5603, pp. 138-145 (2004), 10.1117/12.580588
J.L. Flanagan, A.C. Surendran, E.E. Jan, "Spatially selective sound capture for speech and audio processing," Speech Communication, vol. 13, pp. 207-222 (1993), 10.1016/0167-6393(93)90072-S
C. Knapp, G. Carter, "The generalized correlation method for estimation of time delay," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, pp. 320-327 (1976), 10.1109/TASSP.1976.1162830
S. Imai, "Cepstral analysis synthesis on the mel frequency scale," International Conference on Acoustics, Speech, and Signal Processing, vol. 8, pp. 93-96 (1983), 10.1109/ICASSP.1983.1172250
T. Takechi, K. Sugimoto, T. Mandono, H. Sawada, "Automobile identification based on the measurement of car sounds," Conference of the Industrial Electronics Society, vol. 2, pp. 1784-1789 (2004), 10.1109/IECON.2004.1431853
A. Nehorai, B. Porat, "Adaptive comb filtering for harmonic signal enhancement," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, pp. 1124-1138 (1986), 10.1109/TASSP.1986.1164952
Y. Takeuchi, "Problems and Latest Solutions for Doppler-Autocorrelation Fetal Heart Rate Measurement," Technical Report of IEICE, EA, vol. 96, pp. 23-30 (1996)