The Ravel data set

Authors: Antoine Deleforge, Jordi Sanchez-Riera, Xavier Alameda-Pineda, Johannes Wienke, Radu Horaud

DOI:

Keywords: Data mining, Test algorithm, Robot, Usability, Data set, Engineering

Abstract: In this paper, we introduce the publicly available Ravel data set. All scenarios were recorded using the audio-visual (AV) robot head POPEYE, equipped with two cameras and four microphones. The recording environment was a regular meeting room, presenting all the challenges of a natural indoor scene. The acquisition setup is fully detailed, as is the design of the scenarios. Two examples of use are provided, demonstrating the usability of the Ravel data set. Since the current trend is towards robots able to interact in unconstrained environments, Ravel offers a test bed for algorithms and methods aiming to satisfy these constraints. The data set is available at the following URL: http://ravel.humavips.eu/
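As a rough illustration of how recordings of this kind (synchronized stereo video plus four-microphone audio) might be consumed, the following Python sketch pairs left/right camera frames and reads a multi-channel WAV file. The directory layout, file names, and path shown are assumptions made for illustration, not the actual Ravel distribution format.

```python
# Minimal sketch for iterating a Ravel-style recording: stereo image frames
# from two cameras plus one multi-channel (four-microphone) WAV file.
# NOTE: the directory layout and file names below are hypothetical.
import wave
from pathlib import Path


def load_multichannel_audio(wav_path: Path):
    """Read a multi-channel WAV file into raw PCM bytes (stdlib only)."""
    with wave.open(str(wav_path), "rb") as wav:
        n_channels = wav.getnchannels()   # expected: 4 microphones
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    return frames, n_channels, sample_rate


def list_stereo_frames(scenario_dir: Path):
    """Pair left/right camera frames by sorted index (hypothetical naming)."""
    left = sorted((scenario_dir / "left").glob("*.png"))
    right = sorted((scenario_dir / "right").glob("*.png"))
    return list(zip(left, right))


if __name__ == "__main__":
    # Hypothetical scenario path; adapt to the actual data set layout.
    scenario = Path("ravel/scenario_01")
    audio, channels, rate = load_multichannel_audio(scenario / "audio.wav")
    pairs = list_stereo_frames(scenario)
    print(f"{len(pairs)} stereo frame pairs, {channels}-channel audio at {rate} Hz")
```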
