Reviewing Speech Input with Audio: Differences between Blind and Sighted Users

Authors: Jonggi Hong, Christine Vaing, Hernisa Kacorri, Leah Findlater

DOI: 10.1145/3382039


Abstract: Speech input is a primary method of interaction for blind mobile device users, yet the process of dictating and reviewing recognized text through audio only (i.e., without access to visual feedback) has received little attention. A recent study found that sighted users could identify only about half of automatic speech recognition (ASR) errors when listening to text-to-speech output of the ASR results. Blind screen reader users, in contrast, may be better able to identify these errors due to their greater use of speech interaction and increased ability to comprehend synthesized speech. To compare the two groups' experiences with speech input and ASR errors, as well as their ability to identify errors through audio-only interaction, we conducted a lab study with 12 blind and 12 sighted participants. The study included a semi-structured interview portion to qualitatively understand experiences with ASR, followed by a controlled task to quantitatively compare participants' ability to identify errors in their dictated text. Findings revealed differences between blind and sighted participants in terms of how they use speech input and their level of concern about ASR errors (e.g., blind participants were more highly concerned). In the controlled task, blind participants identified only 40% of ASR errors, which, counter to our hypothesis, was not significantly different from sighted participants' performance. An in-depth analysis of participants' speech input and error identification strategies showed differences in how closely they scrutinized the text as they entered and reviewed it. Our findings indicate a need for future work on supporting blind users in confidently using speech input to generate accurate, error-free text.
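To make the error-identification measure concrete: an ASR error rate of the kind discussed above is typically computed by aligning the recognized hypothesis against the intended reference at the word level and counting substitutions, insertions, and deletions. The sketch below is illustrative only and not the paper's method; the function name `asr_word_errors` and the use of Python's `difflib` alignment are my own choices for this example.

```python
# Illustrative sketch (assumption, not from the paper): count word-level
# ASR errors by aligning the recognized hypothesis to the reference text.
from difflib import SequenceMatcher

def asr_word_errors(reference: str, hypothesis: str):
    """Return (num_errors, error_words), where error_words are the
    hypothesis words involved in substitutions or insertions."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    matcher = SequenceMatcher(a=ref, b=hyp)
    num_errors = 0
    error_words = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # A "replace" span may pair unequal runs of words; count the longer
        # side so substitutions plus leftover insertions/deletions all count.
        num_errors += max(i2 - i1, j2 - j1)
        error_words.extend(hyp[j1:j2])
    return num_errors, error_words

n, words = asr_word_errors(
    "please call me when you arrive",
    "please fall me when you arrive today",
)
# "fall" is a substitution for "call"; "today" is an insertion → 2 errors
```

A listener reviewing dictated text through audio alone, as the study's participants did, is effectively trying to spot these error words without the reference transcript in front of them.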
