Authors: Yoshiharu Sato, Miyuki Seki, Maeda Rie
DOI:
Keywords:
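Abstract: A method for creating a language model that prevents the quality deterioration caused by the conventional back-off to unigram. Parts-of-speech with the same display and reading are obtained from a storage device (206). A cluster (204) is created by combining the obtained parts-of-speech. The created cluster is stored in the storage device. In addition, when an instruction (214) for dividing the cluster is inputted, the cluster stored in the storage device (206) is divided (210) in accordance with the inputted instruction (212). Two clusters are combined (218), and the probability of occurrence of the combined clusters in a text corpus is calculated (222). The combined cluster is associated with a bigram indicating the calculated probability and stored into the storage device.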
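The clustering and bigram steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a lexicon of (display, reading, part-of-speech) triples, treats each cluster as the set of parts-of-speech sharing one display and reading, and estimates the cluster bigram as a plain relative frequency, which is one possible reading of "probability of occurrence". The function names `build_clusters` and `cluster_bigram_probabilities` and the toy data are hypothetical, and the cluster-division step is omitted.

```python
from collections import Counter, defaultdict


def build_clusters(lexicon):
    """Group parts-of-speech that share the same (display, reading) pair."""
    groups = defaultdict(set)
    for display, reading, pos in lexicon:
        groups[(display, reading)].add(pos)
    # One cluster per (display, reading): the combined set of its parts-of-speech.
    return {key: frozenset(tags) for key, tags in groups.items()}


def cluster_bigram_probabilities(corpus, clusters):
    """Estimate P(next cluster | previous cluster) from relative frequencies.

    Assumption: a maximum-likelihood estimate without smoothing; the source
    does not specify how the probability is computed.
    """
    pair_counts = Counter()
    prev_counts = Counter()
    for sentence in corpus:
        seq = [clusters[token] for token in sentence]
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return {pair: n / prev_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical toy data, not taken from the source.
lexicon = [("a", "a", "Noun"), ("a", "a", "Verb"), ("b", "b", "Particle")]
corpus = [[("a", "a"), ("b", "b"), ("a", "a")], [("a", "a"), ("a", "a")]]

clusters = build_clusters(lexicon)
probs = cluster_bigram_probabilities(corpus, clusters)
```

Because the bigram is keyed on pairs of clusters rather than on individual parts-of-speech, a word whose part-of-speech is ambiguous still contributes to a bigram count, which is the intuition behind avoiding a fall back to unigram statistics.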