SHACER: a Speech and Handwriting Recognizer

Author: Edward C. Kaiser

Keywords: Handwriting, Whiteboard, Speech recognition, Vocabulary, Natural language processing, Semantics, Computer science, Handwriting recognition, Schedule (project management), Syllable, Artificial intelligence, Intelligent character recognition

Abstract: In the task domain of a multi-party, multimodal meeting focused on the creation of a whiteboard schedule chart, we have designed and implemented a general method of aligning handwriting and speech for capturing out-of-vocabulary terms, dynamically enrolling them in the system's recognition modules, and then using them to improve subsequent tracking and recognition. Our approach involves the use of an ensemble of syllable and phoneme recognizers whose output is integrated with redundantly delivered handwriting. We refer to our conceptual framework as Multimodal Out-Of-Vocabulary Recognition (MOOVR, pronounced "mover"). Within that framework, this paper describes the Speech and HAndwriting reCognizER module (SHACER, pronounced "shaker"), which observes human-to-human spoken and handwritten interactions, analyzes them off-line, and contributes improved recognitions to the record in the form of a project schedule. We examine an example and show how the technique corrects four of five label errors, including implicitly discovering the semantics of an abbreviation.
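
The abstract does not say how handwriting and speech are aligned. As a purely illustrative sketch, and not the method used in SHACER, one simple way to compare a phoneme sequence derived from a handwritten term with candidate phoneme sequences from a speech recognizer is Levenshtein (edit) distance over phoneme symbols. The function names, the phoneme labels, and the use of unit edit costs are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs, assumed)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(
                prev[j] + 1,         # delete a phoneme from `a`
                curr[j - 1] + 1,     # insert a phoneme from `b`
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def best_spoken_match(
    handwriting_phonemes: Sequence[str],
    spoken_hypotheses: List[Sequence[str]],
) -> Tuple[int, int]:
    """Return (index, distance) of the spoken hypothesis closest to the handwriting.

    Ties are broken in favor of the earlier hypothesis.
    """
    scored = [
        (phoneme_edit_distance(handwriting_phonemes, hyp), idx)
        for idx, hyp in enumerate(spoken_hypotheses)
    ]
    distance, index = min(scored)
    return index, distance


if __name__ == "__main__":
    # Hypothetical phoneme strings, for illustration only.
    handwritten = ["k", "ae", "t"]
    spoken = [["d", "ao", "g"], ["k", "ae", "t", "s"], ["k", "ah", "t"]]
    print(best_spoken_match(handwritten, spoken))
```

A real system would likely weight substitutions by phonetic similarity and combine scores across several recognizers; unit costs are used here only to keep the sketch short.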
