Authors: Philippe Dreuw, David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney
DOI:
Keywords: Speaker recognition, Computer science, Cued speech, Logogen model, Sign language, Natural language processing, Artificial intelligence, Intelligent character recognition, Gesture recognition, Speech recognition, Pronunciation, Language model
Abstract: One of the most significant differences between automatic sign language recognition (ASLR) and automatic speech recognition (ASR) is due to the computer vision problems, whereas the corresponding problems in speech signal processing have been solved through intensive research in the last 30 years. We present our approach, where we start from a large vocabulary speech recognition system to profit from the insights that have been obtained in ASR research. The system developed is able to recognize sentences of continuous sign language independent of the speaker. The features used are obtained from standard video cameras without any special data acquisition devices. In particular, we focus on feature and model combination techniques applied in ASR, and on the usage of pronunciation and language models (LM) in sign language. These techniques can be used for all kinds of sign language recognition systems, and for many video analysis problems where the temporal context is important, e.g. for action or gesture recognition. On a publicly available benchmark database consisting of 201 sentences and 3 signers, we achieve a 17% word error rate (WER).