Authors: Riaz Ahmad, Saeeda Naz, M. Zeshan Afzal, S. Faisal Rashid, Marcus Liwicki
Keywords: Arabic script, Artificial intelligence, Scripting language, Natural language processing, Computer science, Optical character recognition, Urdu, Transfer of learning, Arabic, Pashto, Persian
Abstract: Many languages use Arabic script for written communication, either in basic or augmented form. These languages include Arabic, Urdu, Pashto, Persian, etc. As the primary characters are shared among all these languages, it is possible to take advantage of visual similarities in Optical Character Recognition (OCR). OCR models optimized for individual languages have been proposed. However, to the best of our knowledge, there is no attempt to develop a single OCR system for more than one language. The contributions presented in this work are as follows. First, it investigates the effect on recognition accuracy when different languages are combined (a pioneering study). Second, it introduces publicly available synthetic datasets for Arabic and Pashto for experimental purposes. Third, this paper provides statistical analysis as clues for transfer learning concerning OCR systems for Arabic, Urdu, and Pashto languages.