Author: Xavier L. Aubert
Keywords:
Abstract: A number of decoding strategies for large vocabulary continuous speech recognition (LVCSR) are examined from the viewpoint of their search space representation. Different design solutions are compared with respect to the integration of linguistic and acoustic constraints, as implied by m-gram language models (LM) and cross-word (CW) phonetic contexts. This study is structured along two main axes: the network expansion and the search algorithm itself. The network can be expanded statically or dynamically, while the search can proceed either time-synchronously or asynchronously, which leads to distinct architectures. Three broad classes of decoding methods are briefly reviewed: the use of weighted finite state transducers (WFST) for static network expansion, time-synchronous dynamic-expansion search, and asynchronous stack decoding. Heuristic methods for further reducing the search space are also considered. The main approaches are compared, and some prospective views are formulated regarding possible future avenues.