Speech recognition techniques for a sign language recognition system.

Authors: Philippe Dreuw, David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney

DOI:

Keywords: Speaker recognition, Computer science, Cued speech, Logogen model, Sign language, Natural language processing, Artificial intelligence, Intelligent character recognition, Gesture recognition, Speech recognition, Pronunciation, Language model

Abstract: One of the most significant differences between automatic sign language recognition (ASLR) and automatic speech recognition (ASR) is due to the computer vision problems, whereas the corresponding problems in speech signal processing have been solved through intensive research in the last 30 years. We present our approach, in which we start from a large vocabulary speech recognition system to profit from the insights that have been obtained in ASR research. The system developed is able to recognize sentences of continuous sign language independent of the speaker. The features used are obtained from standard video cameras without any special data acquisition devices. In particular, we focus on feature and model combination techniques applied in ASR, and on the usage of pronunciation and language models (LM) in sign language. These techniques can be used for all kinds of sign language recognition systems, and for many video analysis problems where the temporal context is important, e.g. for action or gesture recognition. On a publicly available benchmark database consisting of 201 sentences and 3 signers, we achieve a 17% word error rate (WER).

References (15)
Richard Bowden, David Windridge, Timor Kadir, Andrew Zisserman, Michael Brady. A Linguistic Feature Vector for the Visual Interpretation of Sign Language. European Conference on Computer Vision, pp. 390-401 (2004). 10.1007/978-3-540-24670-1_30
William C. Stokoe, Carl G. Croneberg, Dorothy C. Casterline. A Dictionary of American Sign Language on Linguistic Principles. Gallaudet College Press (1965).
Ying Wu, Thomas S. Huang. Vision-Based Gesture Recognition: A Review. GW '99 Proceedings of the International Gesture Workshop on Gesture-Based Communication in Human-Computer Interaction, pp. 103-115 (1999). 10.1007/3-540-46616-9_10
D. Keysers, R. Paredes, H. Ney, T. Kolsch. Enhancements for Local Feature Based Image Classification. International Conference on Pattern Recognition, vol. 1, pp. 248-251 (2004). 10.1109/ICPR.2004.329
Dietrich Klakow, Jochen Peters. Testing the Correlation of Word Error Rate and Perplexity. Speech Communication, vol. 38, pp. 19-28 (2002). 10.1016/S0167-6393(01)00041-3
Guilin Yao, Hongxun Yao, Xin Liu, Feng Jiang. Real Time Large Vocabulary Continuous Sign Language Recognition Based on OP/Viterbi Algorithm. International Conference on Pattern Recognition, vol. 3, pp. 312-315 (2006). 10.1109/ICPR.2006.954
S.C.W. Ong, S. Ranganath. Automatic Sign Language Analysis: A Survey and the Future beyond Lexical Meaning. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, pp. 873-891 (2005). 10.1109/TPAMI.2005.112
A. Agarwal, B. Triggs. Recovering 3D Human Pose from Monocular Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, pp. 44-58 (2006). 10.1109/TPAMI.2006.21
Daniel Keysers, Thomas Deselaers, Christian Gollan, Hermann Ney. Deformation Models for Image Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, pp. 1422-1435 (2007). 10.1109/TPAMI.2007.1153
Philippe Dreuw, Daniel Stein, Hermann Ney. Enhancing a Sign Language Translation System with Vision-Based Features. Gesture-Based Human-Computer Interaction and Simulation, pp. 108-113 (2009). 10.1007/978-3-540-92865-2_11