Author: Alex Trutnev
Keywords:
Abstract: In this work, we propose different strategies for efficiently integrating an automated speech recognition module in the framework of a dialogue-based vocal system. The aim is to study ways of improving the quality and robustness of the recognition. We first concentrate on the choice of the type of acoustic models that should be used for speech recognition. Our goal is to evaluate the hypothesis that hybrid models, in which the estimation of frame-based phoneme probabilities is made through artificial neural networks, provide performance results similar to those of "classical" hidden Markov models using multi-Gaussian estimations, while being more robust in generalization across tasks. We experimentally show that, due to the size of the parameter space to be explored, it is not always practically possible to achieve with hybrid models a performance comparable to that of the multi-Gaussian models, and that in fact hybrid models often lead to worse performance. In the second part, we focus on one of the main limitations of state-of-the-art speech recognition: the inadequacy of the one-best approach to yield a hypothesis corresponding to the right transcription. For this, we explore a solution consisting in producing, during decoding, a word lattice containing a very large number of hypotheses, which is then filtered by a syntactic analyzer using more sophisticated models such as stochastic context-free grammars. The syntactically correct hypotheses are then used for further processing. More precisely, we study an approach consisting in dynamically tuning the relative importance of the acoustic and language models, resulting in an increase of the lexical and syntactic variability in the word lattice. We identify and quantify two important drawbacks of this approach: its high computational cost and the impossibility to guarantee that, in practice, the right transcription is indeed present in the word lattice. Finally, we study the problem of the use of generic linguistic resources (language models and phonetic lexica) for efficient recognition results. In this context, we study the integration of dynamic linguistic resources controlled by an associated dialogue model. In this approach, restricted language models and a lexicon dependent on the dialogue context are used in place of the complete generic ones. We verify that this approach yields a significant increase in recognition performance when, for a given application, an adequate dialogue model can be integrated with the speech recognition module. In this perspective, we propose an enhancement of the dialogue prototyping methodology through the integration of speech recognition error simulation within the Wizard-of-Oz simulation. This enables and guarantees a better adequacy of the dialogue model to the targeted application.