MiPad: a multimodal interaction prototype

Authors: X. Huang, A. Acero, C. Chelba, L. Deng, J. Droppo

DOI: 10.1109/ICASSP.2001.940754

Abstract: Dr. Who is a Microsoft research project aiming at creating a speech-centric multimodal interaction framework, which serves as the foundation for the .NET natural user interface. MiPad is the application prototype that demonstrates compelling user advantages for wireless personal digital assistant (PDA) devices. MiPad fully integrates continuous speech recognition (CSR) and spoken language understanding (SLU) to enable users to accomplish many common tasks using a multimodal interface and wireless technologies. It tries to solve the problem of pecking with tiny styluses or typing on minuscule keyboards in today's PDAs. Unlike a cellular phone, MiPad avoids speech-only interaction. It incorporates a built-in microphone that activates whenever a field is selected. As a user taps the screen or uses a built-in roller to navigate, the tapping action narrows the number of possible instructions for spoken word understanding. MiPad currently runs on a Windows CE Pocket PC with a Windows 2000 machine where speech recognition is performed. The Dr. Who CSR engine uses a unified CFG and n-gram language model. The Dr. Who SLU engine is based on a robust chart parser and a plan-based dialog manager. This paper discusses MiPad's design, implementation work in progress, and a preliminary user study in comparison to the existing pen-based PDA interface.

References (6)
Ye-Yi Wang. Robust Spoken Language Understanding in MiPad. International Speech Communication Association, pp. 1555-1558 (2001).
Alex Acero, Mike Plumpe, Li Deng, Xuedong Huang. Large-vocabulary speech recognition under adverse acoustic environments. Conference of the International Speech Communication Association, pp. 806-809 (2000).
X. Huang, A. Acero, F. Alleva, M. Hwang, L. Jiang, M. Mahajan. From Sphinx-II to Whisper: Making Speech Recognition Usable. Springer, Boston, MA, pp. 481-508 (1996). DOI: 10.1007/978-1-4613-1367-0_20
J. Droppo, A. Acero, Li Deng. Efficient on-line acoustic environment estimation for FCDCN in a continuous speech recognition system. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 209-212 (2001). DOI: 10.1109/ICASSP.2001.940804
Li Deng, A. Acero, Li Jiang, J. Droppo, Xuedong Huang. High-performance robust speech recognition using stereo training data. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 301-304 (2001). DOI: 10.1109/ICASSP.2001.940827
Ye-Yi Wang, M. Mahajan, Xuedong Huang. A unified context-free grammar and n-gram model for spoken language processing. International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1639-1642 (2000). DOI: 10.1109/ICASSP.2000.862062