Author: Young-Soog Chae
DOI:
Keywords:
Abstract: This paper presents techniques for correcting spelling, orthographical, and grammatical errors in computer-based text. It also addresses an extension that goes beyond the normal checking of isolated single words by taking multi-word units, as well as the sentence, into account. Candidate words are created by applying function rules and a correction rule table that contains heuristic and collocation information. To prevent excessive creation of candidates and to improve accuracy, we use a high-frequency dictionary of 300,000 words derived from a corpus. Using partial-parsing rules based on constituent grammar, collocations between words can be found. We experimented with the techniques on corpora drawn from the final results of SERI's research, texts, newspaper materials, and public materials. The system has a 98% accuracy rate when the 8.5% of errors caused by unregistered words are excluded. The average number of prospective candidates suggested is 1.12.