PINK PANTHER: A COMPLETE ENVIRONMENT FOR GROUND-TRUTHING AND BENCHMARKING DOCUMENT PAGE SEGMENTATION

Authors: BERRIN A. YANIKOGLU, LUC VINCENT

DOI: 10.1016/S0031-3203(97)00137-4

Keywords: Segmentation; Ground truth; Document processing; Computer vision; Optical character recognition; Artificial intelligence; Set (abstract data type); Pattern recognition (psychology); Scale-space segmentation; Computer science

Abstract: We describe a new approach for the automatic evaluation of document page segmentation algorithms. Unlike techniques that rely on OCR output, our method is region-based: segmentation quality is assessed by comparing the segmentation output, described as a set of regions, to the corresponding ground-truth. Error maps are used to keep track of all the errors associated with each pixel, regardless of document complexity. Misclassifications, splitting, and merging of regions are among the errors detected by the system. Each error can be weighted individually, so that the system can be customized to benchmark virtually any type of segmentation task.
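
The region-based comparison described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the rectangular `Region` type, the three error flags, the overlap-based split/merge test, and the weight values are all assumptions chosen for illustration, since the abstract does not specify the actual region representation or error taxonomy. The sketch rasterizes ground-truth and detected regions, records per-pixel error flags in an error map, reports split and merged regions, and sums a weighted score.

```python
from dataclasses import dataclass

# Hypothetical per-pixel error flags (bit values chosen for this sketch only).
MISSED, FALSE_ALARM, MISCLASSIFIED = 1, 2, 4


@dataclass(frozen=True)
class Region:
    """Axis-aligned box [x0, x1) x [y0, y1) with a content label (assumed shape)."""
    x0: int
    y0: int
    x1: int
    y1: int
    label: str


def rasterize(regions, width, height):
    """Map each pixel to the index of the region covering it, or None.
    If regions overlap, the later one in the list wins."""
    grid = [[None] * width for _ in range(height)]
    for idx, r in enumerate(regions):
        for y in range(max(r.y0, 0), min(r.y1, height)):
            for x in range(max(r.x0, 0), min(r.x1, width)):
                grid[y][x] = idx
    return grid


def error_map(truth, detected, width, height):
    """Return (per-pixel error flags, set of overlapping (truth, detected) index pairs)."""
    t = rasterize(truth, width, height)
    d = rasterize(detected, width, height)
    errors = [[0] * width for _ in range(height)]
    overlaps = set()
    for y in range(height):
        for x in range(width):
            ti, di = t[y][x], d[y][x]
            if ti is not None and di is None:
                errors[y][x] |= MISSED
            elif ti is None and di is not None:
                errors[y][x] |= FALSE_ALARM
            elif ti is not None:
                overlaps.add((ti, di))
                if truth[ti].label != detected[di].label:
                    errors[y][x] |= MISCLASSIFIED
    return errors, overlaps


def splits_and_merges(overlaps):
    """A truth region hit by several detections is split; a detection
    covering several truth regions is a merge."""
    per_truth, per_detected = {}, {}
    for ti, di in overlaps:
        per_truth.setdefault(ti, set()).add(di)
        per_detected.setdefault(di, set()).add(ti)
    splits = sorted(ti for ti, ds in per_truth.items() if len(ds) > 1)
    merges = sorted(di for di, ts in per_detected.items() if len(ts) > 1)
    return splits, merges


def weighted_score(errors, weights):
    """Sum the weight of every error flag set on every pixel."""
    return sum(w for row in errors for flags in row
               for bit, w in weights.items() if flags & bit)


# Toy page: a text block above an image; the detector returns one big text block.
truth = [Region(0, 0, 4, 2, "text"), Region(0, 3, 4, 5, "image")]
detected = [Region(0, 0, 4, 5, "text")]
errors, overlaps = error_map(truth, detected, width=4, height=5)
splits, merges = splits_and_merges(overlaps)
score = weighted_score(errors, {MISSED: 1.0, FALSE_ALARM: 0.5, MISCLASSIFIED: 2.0})
```

In this toy case the single detected block is reported as a merge of both ground-truth regions. Changing the weights dictionary shifts which kinds of error dominate the score, which is the sense in which such a benchmark could be tuned to a particular task.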
