Authors: Dennis J. Drown, Taghi M. Khoshgoftaar, Ramaswamy Narayanan
Keywords:
Abstract: We study the problem of correcting spelling mistakes in text using memory-based learning techniques and a very large database of token n-gram occurrences from the web as training data. Our approach uses the context in which an error appears to select the most likely candidate from the words that might have been intended in its place. Using a novel correction algorithm and massive training data, we demonstrate higher accuracy on real-word errors than previous work, and high accuracy at the new task of ranking corrections for non-word errors given by a standard package.
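
The abstract does not spell out the correction algorithm, so the following is only a minimal sketch of the general idea it names: ranking candidate words by how often they occur in the surrounding context according to n-gram counts. The function `rank_candidates`, the `TOY_COUNTS` table, and the rule of comparing longer n-grams before shorter ones are all assumptions made for illustration, not the authors' method, and the counts are invented toy values rather than web data.

```python
from typing import Dict, List, Sequence, Tuple

NGramCounts = Dict[Tuple[str, ...], int]

# Hypothetical toy counts standing in for a web-scale n-gram database.
TOY_COUNTS: NGramCounts = {
    ("a", "piece", "of"): 300,
    ("piece", "of", "cake"): 120,
    ("a", "peace", "of"): 5,
    ("peace", "of", "cake"): 2,
    ("piece",): 1000,
    ("peace",): 1500,
}


def rank_candidates(
    tokens: Sequence[str],
    index: int,
    candidates: Sequence[str],
    counts: NGramCounts,
    max_n: int = 3,
) -> List[str]:
    """Rank candidate replacements for tokens[index], most plausible first.

    Each candidate is scored by the summed counts of every n-gram window
    that covers the target position. Scores for longer n-grams are compared
    first; shorter n-grams only break ties (an assumed back-off rule).
    """

    def score(word: str) -> Tuple[int, ...]:
        trial = list(tokens)
        trial[index] = word
        per_order = []
        for n in range(max_n, 0, -1):
            total = 0
            # Slide over every window of length n that includes `index`.
            for start in range(index - n + 1, index + 1):
                if start < 0 or start + n > len(trial):
                    continue
                total += counts.get(tuple(trial[start:start + n]), 0)
            per_order.append(total)
        return tuple(per_order)

    return sorted(candidates, key=score, reverse=True)
```

With the toy table, `rank_candidates("I ate a peace of cake".split(), 3, ["peace", "piece"], TOY_COUNTS)` places `"piece"` first, because the trigrams around the target position favor it even though `"peace"` has the higher unigram count. The same ranking call could in principle be applied to a list of suggestions produced by a spelling package for a non-word error.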