Authors: Manabu Kawabe, Takuro Sato, Yoshihito Shimazaki
DOI:
Keywords: Computer science, Speech recognition, Natural language processing, Series (mathematics), Word (computer architecture), Artificial intelligence, Code word, Data compression, Association (psychology)
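Abstract: In a method of compressing text data into output that includes codewords and characters, input words, each composed of a series of one or more characters, are extracted from the text data. A dictionary is provided that contains, as entries, words made up of characters. A codeword is stored in association with each word, and occurrence counts of the respective words are also stored. The dictionary is searched to find whether or not the input word matches any of its words. When a match is found, the codeword assigned to the matching word is produced and the occurrence count is updated; provision is also made for the case in which a new word is introduced.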
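
The lookup-and-count loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `WordDictionary`, `compress`, and `decompress`, the regex-based word extraction, and in particular the handling of an unmatched word (emit its characters and register it with a count of 1) are assumptions made for illustration, since the abstract does not spell out those details.

```python
import re


class WordDictionary:
    """Hypothetical dictionary mapping word -> [codeword, occurrence count]."""

    def __init__(self):
        self.entries = {}
        self.next_code = 0  # assumption: codewords are sequential integers

    def encode_word(self, word):
        entry = self.entries.get(word)
        if entry is not None:
            # Match found: update the occurrence count, produce the codeword.
            entry[1] += 1
            return ("code", entry[0])
        # Assumption: an unmatched word is emitted as raw characters and
        # introduced into the dictionary as a new entry with count 1.
        self.entries[word] = [self.next_code, 1]
        self.next_code += 1
        return ("chars", word)


def compress(text):
    """Extract input words and encode each one against the dictionary."""
    dictionary = WordDictionary()
    # Assumption: a "word" is a run of word characters, or one other character.
    words = re.findall(r"\w+|\W", text)
    stream = [dictionary.encode_word(w) for w in words]
    return stream, dictionary


def decompress(stream):
    """Rebuild the text by mirroring the encoder's dictionary growth."""
    words_by_code = []
    out = []
    for kind, value in stream:
        if kind == "chars":
            words_by_code.append(value)  # same code the encoder assigned
            out.append(value)
        else:
            out.append(words_by_code[value])
    return "".join(out)
```

For example, `compress("the cat and the dog")` emits the second occurrence of `the` as a codeword rather than as characters, and `decompress` restores the original string from the resulting stream. The stored occurrence counts are not used for anything further in this sketch.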