Offline Printed Urdu Nastaleeq Script Recognition with Bidirectional LSTM Networks

Authors: Adnan Ul-Hasan, Saad Bin Ahmed, Faisal Rashid, Faisal Shafait, Thomas M. Breuel

DOI: 10.1109/ICDAR.2013.212

Abstract: Recurrent neural networks (RNNs) have been successfully applied to the recognition of cursive handwritten documents in both English and Arabic scripts. The ability of RNNs to model context in sequence data such as speech and text makes them suitable candidates for developing OCR systems for printed Nabataean scripts (including Nastaleeq, for which no OCR system is available to date). In this work, we present the results of applying RNNs to printed Urdu text in Nastaleeq script. A Bidirectional Long Short-Term Memory (BLSTM) architecture with a Connectionist Temporal Classification (CTC) output layer was employed to recognize printed Urdu text. We evaluated BLSTM networks in two cases: one ignoring the character's shape variations and a second considering them. The recognition error rate at the character level is 5.15% for the first case and 13.6% for the second. These results were obtained on the synthetically generated UPTI dataset, which contains artificially degraded images that reflect some real-world scanning artifacts, along with clean images. A comparison with a shape-matching based method is also presented.
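
The abstract reports error rates at the character level but does not spell out how they are computed. A common convention for OCR, assumed here rather than taken from the paper, is to divide the total Levenshtein edit distance between recognized and ground-truth text lines by the total number of ground-truth characters. The function names `edit_distance` and `character_error_rate` below are hypothetical and serve only as a minimal sketch of that convention.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(pairs):
    """Assumed metric: total edits / total ground-truth characters.

    `pairs` is an iterable of (ground_truth, recognized) strings.
    """
    pairs = list(pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("ground truth contains no characters")
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    return total_edits / total_chars
```

Under this assumed definition, a rate of 5.15% corresponds to roughly one character edit for every 19 ground-truth characters. Python strings are compared per Unicode code point, so whether shape variations of a character count as distinct labels depends on how the ground truth is encoded.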
