Abstract: Due to Arabic's morphological complexity, Arabic retrieval benefits greatly from morphological analysis, particularly stemming. However, the best known stemming does not handle linguistic phenomena such as broken plurals and malformed stems. In this paper we propose a model of character-level transformation that is trained using Wikipedia hypertext-to-page-title links. The use of our model yields statistically significant improvements in retrieval over a statistical stemming technique. The technique can potentially be applied to other languages.