Authors: Riaz Ahmad, M. Zeshan Afzal, S. Faisal Rashid, Marcus Liwicki, Thomas Breuel
Keywords:
Abstract: This paper presents the first Pashto text image database for scientific research, and thereby a dataset with complete handwritten and printed text line images that ultimately covers all alphabets of the Arabic and Persian languages. Languages like Pashto, written in a complex way by calligraphers, still require a mature Optical Character Recognition (OCR) system. Although 50 million people use this language for both oral and written communication, no significant effort has been devoted to the recognition of Pashto script. A real dataset of 17,015 images containing Pashto text lines is introduced. The images are acquired by scanning hand-scribed Pashto books. Further, in this work, we evaluate the performance of deep learning based models, namely Bidirectional and Multi-Dimensional Long Short Term Memory (BLSTM and MDLSTM) networks, on Pashto texts and provide a baseline character error rate of 9.22%.