Learning Algorithms for Document Layout Analysis

Author: Simone Marinai

DOI: 10.1016/B978-0-444-53859-8.00016-3

Keywords:

Abstract: In this chapter we describe several approaches that have been proposed to use learning algorithms to analyze the layout of digitized documents. Layout analysis encompasses all the techniques that are used to infer the organization of the page layout of document images. From a physical point of view, the layout can be described as composed of blocks, in most cases rectangular, that are arranged in the page and contain homogeneous content, such as text, vectorial graphics, or illustrations. From a logical point of view, text blocks have a different meaning on the basis of their content and their position in the page. For instance, in the case of technical papers, blocks correspond to the title, author, or abstract of the paper. The learning algorithms adopted in this domain are often related to supervised classifiers that are used at various processing levels to label the objects in the document image according to physical or logical categories. The classification can be performed for individual pixels, for regions, or even for whole pages. The different approaches to layout analysis using learning algorithms are analyzed in this chapter.

References (68)
Gerhard Paaß, Iuliu Konya. Machine Learning for Document Structure Recognition. Modeling, Learning, and Processing of Text Technological Data Structures, pp. 221-247 (2011). DOI: 10.1007/978-3-642-22613-7_12
Majid Mirmehdi, Paul Clark. Finding Text Regions Using Localised Measures. British Machine Vision Conference, pp. 675-684 (2000).
Floriana Esposito, Stefano Ferilli, Teresa M. A. Basile, Nicola Di Mauro. Machine Learning for Digital Document Processing: from Layout Analysis to Metadata Extraction. Machine Learning in Document Analysis and Recognition, pp. 105-138 (2008). DOI: 10.1007/978-3-540-76280-5_5
José Luis Hidalgo, Salvador España, María José Castro, José Alberto Pérez. Enhancement and Cleaning of Handwritten Data by Using Neural Networks. Pattern Recognition and Image Analysis, pp. 376-383 (2005). DOI: 10.1007/11492429_46
Abdel Belaïd, Yves Rangoni. Structure Extraction in Printed Documents Using Neural Approaches. Machine Learning in Document Analysis and Recognition, pp. 21-43 (2008). DOI: 10.1007/978-3-540-76280-5_2
Simone Marinai. Introduction to Document Analysis and Recognition. Machine Learning in Document Analysis and Recognition, pp. 1-20 (2008). DOI: 10.1007/978-3-540-76280-5_1
Jean Duong, Myrian Côté, Hubert Emptoz. Feature Approach for Printed Document Image Analysis. Lecture Notes in Computer Science, pp. 159-167 (2002). DOI: 10.1007/3-540-70659-3_16
Zheru Chi, Qing Wang, Wan-Chi Siu. Hierarchical content classification and script determination for automatic document image processing. Pattern Recognition, vol. 36, pp. 2483-2500 (2003). DOI: 10.1016/S0031-3203(03)00128-6