Methods of Arabic Language Baseline Detection - The State of Art

Authors: Atallah AL-Shatnawi, Khairuddin Omar

Keywords: Natural language processing; Preprocessor; Word (computer architecture); Normalization (image processing); Artificial intelligence; Feature extraction; Imaginary line; Character (computing); Segmentation; Computer science; Pattern recognition; Baseline (configuration management)

Abstract: Preprocessing is the most important stage in an Arabic OCR system; it has a direct effect on the reliability and efficiency of the segmentation and feature extraction stages. Arabic is written cursively, and each of its characters has between two and four shapes. An Arabic word typically consists of two or more characters connected along an imaginary line called the baseline. Detecting the baseline is one of the main tasks of preprocessing in an Arabic OCR system, and the detected baseline can be used for both skew normalization and character segmentation. This paper aims to provide a comprehensive review of the methods proposed by researchers to detect the Arabic baseline. These baseline detection methods are categorized into four groups: (a) horizontal projection methods, (b) the skeleton method, (c) the contour tracing method, and (d) the principal component analysis method. Each of these methods has its own advantages and drawbacks.
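
As an illustration of the first category only, horizontal projection can be sketched as follows. This is a minimal sketch, not the implementation of any method surveyed in the paper. It assumes the word image is already binarized (1 for ink, 0 for background) and deskewed, and it takes the baseline to be the row containing the most ink pixels. The function names and the toy image are hypothetical.

```python
def horizontal_projection(image):
    """Count foreground (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]


def detect_baseline(image):
    """Return the index of the row with the most foreground pixels.

    Assumption: the baseline is the peak of the horizontal projection.
    Returns None when the image is empty or contains no ink.
    """
    profile = horizontal_projection(image)
    if not profile or max(profile) == 0:
        return None
    # list.index returns the first row that reaches the peak value
    return profile.index(max(profile))


# Toy 6x8 binary "word" image (hypothetical data): row 3 is the connecting stroke
word = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]
baseline_row = detect_baseline(word)
print(baseline_row)
```

A single peak like this is only reliable on horizontal text. That is one reason the baseline is tied to skew normalization in the abstract.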
