Selective back-off smoothing for incorporating grammatical constraints into the n-gram language model.

Authors: Katunobu Itou, Atsushi Fujii, Tetsuya Ishikawa, Tomoyosi Akiba

Abstract: Spoken queries submitted to question answering systems usually consist of query contents (e.g. about newspaper articles) and frozen patterns (e.g. WH-words), which can be modeled with N-gram models and grammar-based models, respectively. We propose a method to integrate those different types of models into a single N-gram model. We represent the two types of language models in a single word network. However, common smoothing methods, which are effective for N-gram models, decrease the grammatical constraints on frozen patterns. For this problem, we propose a selective back-off smoothing method, which controls the degree to which smoothing is applied depending on the network fragment. Additionally, the resulting models are compatible with conventional back-off N-gram models, and thus existing N-gram decoders can easily be used. We show the effectiveness of our method by way of experiments.

1. INTRODUCTION

The N-gram model has been used successfully as a language model for large vocabulary continuous speech recognition (LVCSR) systems. The N-gram model is simple but robust enough to model all word sequences in the vocabulary. However, it needs a large training corpus, and such a corpus cannot be constructed unless there already exists a large text collection based on, for example, newspaper articles.

On the other hand, a grammar-based language model has been used for tasks involving a relatively small vocabulary. This model does not need a large training corpus because it takes advantage of linguistic knowledge. It can model correlations between more distant words than is possible with the N-gram model, which models only local relations between words.

Thus, some spoken sentences are modeled suitably by the N-gram model and others are modeled suitably by the grammar-based model. This is also true from an intra-sentence perspective: some parts of a sentence are best modeled by an N-gram model and others by a grammar-based model.

For example, question answering systems receive queries that often consist of a part that conveys various contents about, for example, newspaper articles, and a part that represents a frozen pattern of query sentences. The first part seems to be dealt with using an N-gram model trained on newspaper articles. The second part ...
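The idea of applying smoothing selectively can be illustrated with a toy example. The excerpt does not give the authors' formulation, so the following is only a minimal sketch under stated assumptions: a bigram model with absolute discounting, where a hypothetical per-history `smoothing_degree` in [0, 1] scales the discount. The function name, the `smoothing_degree` parameter, and the sample queries are all illustrative and not taken from the paper. Histories inside a frozen pattern get degree 0, so no probability mass is released to unseen successors and the grammatical constraint is preserved. Content-word histories keep the full discount.

```python
from collections import Counter, defaultdict


def train_selective_backoff(sentences, smoothing_degree, discount=0.5):
    """Return prob(word, prev) for a back-off bigram model.

    Assumption (not from the paper): absolute discounting, with the
    discount for history `prev` scaled by smoothing_degree.get(prev, 1.0).
    """
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens[1:])  # "<s>" is never predicted
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[prev][word] += 1
    total = sum(unigrams.values())

    def prob(word, prev):
        p_uni = unigrams[word] / total
        followers = bigrams.get(prev)
        if not followers:
            return p_uni  # unknown history: fall back to the unigram
        d = discount * smoothing_degree.get(prev, 1.0)
        n = sum(followers.values())
        if word in followers:
            return (followers[word] - d) / n
        # Mass freed by discounting, shared among unseen successors
        # in proportion to their unigram probability.
        backoff_mass = d * len(followers) / n
        unseen_uni = 1.0 - sum(unigrams[w] / total for w in followers)
        if backoff_mass == 0.0 or unseen_uni <= 0.0:
            return 0.0
        return backoff_mass * p_uni / unseen_uni

    return prob


# Hypothetical toy queries: a frozen WH-pattern followed by content words.
sentences = [
    ["what", "is", "the", "capital", "of", "japan"],
    ["what", "is", "the", "population", "of", "tokyo"],
    ["who", "is", "the", "president", "of", "france"],
]
# Degree 0 switches smoothing off for histories in the frozen pattern.
smoothing_degree = {"<s>": 0.0, "what": 0.0, "who": 0.0, "is": 0.0}
prob = train_selective_backoff(sentences, smoothing_degree)

print(prob("the", "is"))     # seen successor of a frozen-pattern history
print(prob("of", "is"))      # unseen successor of a frozen-pattern history
print(prob("tokyo", "the"))  # unseen successor of a content history
```

In this sketch an unseen transition after `is` receives zero probability, while an unseen transition after `the` still receives back-off mass. Intermediate degrees between 0 and 1 would relax the constraint gradually. Because the result is still expressed as discounted bigram probabilities plus a back-off weight per history, it stays in the form a conventional back-off decoder expects, which is the compatibility property the abstract mentions.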
