Authors: Jonathan Hamaker, Neeraj Deshmukh, Aravind Ganapathiraju, Joseph Picone
DOI:
Keywords:
Abstract: The high cost of the infrastructure required to conduct state-of-the-art speech recognition research prevents many small groups from evaluating new ideas on large-scale tasks. To overcome this barrier, we are developing an Internet-based speech-to-text (STT) toolkit. In this paper, we present the core component of this system: a decoder that uses a one-pass, time-synchronous, Viterbi-based search algorithm called trace projection. This decoder can support efficient lattice rescoring using cross-word triphones, lexical trees and n-gram grammars. Its performance in terms of CPU and memory usage is on par with commercial systems of its kind. Preliminary evaluations on the SWITCHBOARD (SWB) corpus have yielded a word error rate of 39%.