Author: Mark Huckvale
DOI:
Keywords:
Abstract: As users are only too aware, contemporary large vocabulary speech recognition systems do not respond to speech in the same way as humans. The dictation systems that users use today are very sensitive to disfluencies, restarts, background noise and change of speaker or voice quality. Furthermore, the mistakes they make seem to be different from the ones humans make, even when listening in poor environments. There is no doubt that users will become more comfortable with systems that act more like a human listener. This should mean that scientific knowledge about how humans process speech is relevant and important to the design of these systems. Unlike the situation in the early days of the field, it is now the case that research into the human processing of language has diverged from research into speech recognition systems. We now have the separate and independent fields of 'psycholinguistics' and 'spoken language engineering'. This article explores the relationship between the engineering and the cognitive science communities within the relatively well-defined sub-field of spoken word recognition. That is, we shall be mainly concerned with the processes by which word sequences are recovered from the acoustic input. The article is in three parts. The roots of the divergence of the engineering and cognitive accounts are explored in the first part. Differences in motivation, methodology and culture are all seen to play a part in the historical context. The second part discusses the potential benefits of a re-convergence of the two fields and argues that the time is ripe for progress now. Engineering systems are stable and successful enough to be worth interpreting in cognitive terms, while cognitive accounts are sophisticated enough to allow useful comparisons to be undertaken. The final part proposes some elements of a joint research programme that could be the stimulus for the two fields to work together. Highlighted are priming phenomena that relate to recent work in adaptation, and morphological processing that relates to problems in vocabulary selection and use. Other possibilities are phonetic reduction at the low end, and semantic grouping and phrasing at the high end, of both human and machine recognition.