Authors: T. CINCAREK, H. KAWANAMI, R. NISIMURA, A. LEE, H. SARUWATARI
DOI: 10.1093/IETISY/E91-D.3.576
Keywords:
Abstract: In this paper, the development, long-term operation and portability of a practical ASR application in a real environment are investigated. The target application is a speech-oriented guidance system installed at a local community center. The system has been exposed to ordinary people since November 2002. More than 300 hours, or more than 700,000 inputs, have been collected during four years. The outcome is a rare example of a large-scale real-environment speech database. A simulation experiment is carried out with this database to investigate how the system's performance improves during the first two years of operation. The purpose is to determine empirically the amount of real-environment data which has to be prepared to build a system with reasonable recognition performance and response accuracy. Furthermore, the relative importance of developing the main system components, i.e., the speech recognizer and the response generation module, is assessed. Although depending on the system's modeling capacities and the domain complexity, experimental results show that overall performance stagnates after employing about 10-15 k utterances for training the acoustic model, 40-50 k utterances for training the language model, and 40-50 k utterances for compiling the question and answer (Q&A) database. The Q&A database was most important for improving the system's response accuracy. Finally, the portability of the well-trained first prototype to a different environment, a subway station, is investigated. Since the collection and preparation of large amounts of real-environment data is impractical in general, only one month of data from the new environment is employed for adaptation. While the speech recognition component has a high degree of portability, the response accuracy is lower in the new environment. The reason is the difference between the two systems, which are installed in different environments. This implies that it is imperative to take the behavior of users under real conditions into account in order to build a system with high user satisfaction.