Cell identification in table analysis

Author: John C. Handley

DOI:

Keywords:

Abstract: The present invention handles fully-lined, semi-lined and line-less tables by identifying cells and cell separators during page recomposition as part of optical character recognition. It accomplishes this iteratively: merging word boxes into cells, finding cell separators, merging word boxes bounded by the same separators into cells, and repeating these steps until the correct table structure is found. With this method, rows are estimated, close words are merged, columns are estimated, words within columns are merged, rows are re-estimated, and words in the same row and column are merged into bigger cells according to the detection of various table styles. The method handles large, complex tables with multiple lines and symbols per cell, whether line-less, semi-lined or lined.
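
The iterative idea in the abstract (estimate rows, merge close words, repeat until nothing changes) can be illustrated with a minimal Python sketch. This is not the patented procedure: it covers only row estimation and close-word merging, omits column estimation and separator detection, and every name in it (`Box`, `estimate_rows`, `merge_close_words`, `find_cells`, and the `max_gap` threshold) is a hypothetical choice made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1), with y growing downward.
Box = Tuple[float, float, float, float]


def union(a: Box, b: Box) -> Box:
    """Smallest box covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def estimate_rows(boxes: List[Box]) -> List[List[Box]]:
    """Group boxes whose vertical extents overlap into rows."""
    rows: List[List[Box]] = []
    bottom = None  # lowest bottom edge seen in the current row
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and box[1] < bottom:
            rows[-1].append(box)
            bottom = max(bottom, box[3])
        else:
            rows.append([box])
            bottom = box[3]
    return rows


def merge_close_words(row: List[Box], max_gap: float) -> List[Box]:
    """Merge horizontally adjacent boxes in one row when the gap is small."""
    merged: List[Box] = []
    for box in sorted(row, key=lambda b: b[0]):
        if merged and box[0] - merged[-1][2] <= max_gap:
            merged[-1] = union(merged[-1], box)
        else:
            merged.append(box)
    return merged


def find_cells(words: List[Box], max_gap: float = 10.0) -> List[Box]:
    """Repeat row estimation and merging until the boxes stop changing."""
    cells = sorted(words)
    while True:
        new_cells = sorted(
            cell
            for row in estimate_rows(cells)
            for cell in merge_close_words(row, max_gap)
        )
        if new_cells == cells:
            return cells
        cells = new_cells


# Two text rows; the first two words of the top row sit 5 units apart.
words = [
    (0, 0, 20, 10), (25, 0, 50, 10), (100, 0, 130, 10),
    (0, 20, 30, 30), (100, 20, 120, 30),
]
cells = find_cells(words)
```

The loop terminates because each pass either merges at least two boxes, reducing the count, or leaves the set unchanged. A fuller treatment would re-estimate columns between passes and use detected separators to bound the merges, as the abstract describes.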

References (14)
Michiyoshi Tachikawa, Table region identification method (1990)
Lamott G. Oren, Walter J. Buehring, Brian M. Kennedy, Model-independent and interactive report generation system and method of operation (1995)
Gerald Zaks, Roberto Salama, Dan Adler, Computer-based system and method for data processing (1995)
M. Armon Rahgozar, Robert Cooperman, Graph-based table recognition system, Electronic Imaging: Science and Technology, vol. 2660, pp. 192-203 (1996), doi:10.1117/12.234700
K. Itonori, Table structure recognition based on textblock arrangement and ruled line position, International Conference on Document Analysis and Recognition, pp. 765-768 (1993), doi:10.1109/ICDAR.1993.395625
O. Hori, D.S. Doermann, Robust table-form structure analysis based on box-driven reasoning, International Conference on Document Analysis and Recognition, vol. 1, pp. 218-221 (1995), doi:10.1109/ICDAR.1995.598980