Document image database indexing with pictorial dictionary

Authors: Mohammad Akbari, Reza Azimi

DOI: 10.1117/12.856302

Keywords:

Abstract: In this paper we introduce a new approach for information retrieval from a Persian document image database without using Optical Character Recognition (OCR). First, an attribute called the subword upper contour label is defined; then a pictorial dictionary is constructed based on the subwords. The approach addresses two issues in document retrieval: keyword spotting and retrieval according to similarities. The proposed methods have been evaluated on a database. The results demonstrate the retrieval ability of the approach.
