Authors: S. Mao, A. Rosenfeld, T. Kanungo
DOI: 10.1109/ICIP.2003.1247016
Keywords: Hidden Markov model, Hierarchy (mathematics), Information retrieval, Search engine indexing, Structure (mathematical logic), Title page, Computer science, Image retrieval, Tree (data structure), k-d tree
Abstract: Structural information about a document is essential for structured query processing, indexing, and retrieval. A document page can be partitioned into a hierarchy of homogeneous regions such as columns, paragraphs, etc.; these regions are called physical components, and they define the physical layout of the page. In this paper we develop a class of models for the physical layouts of technical paper title pages. We model physical layout regions using hidden semi-Markov models of the directional projections of the regions, and we model a physical layout using a stochastic attributed K-d tree grammar model of the 2D hierarchical structure of the regions. We use the models to generate sets of synthetic title page images in three distinctive styles, which we use in controlled experiments on physical page layout analysis.
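
To make "directional projections of the regions" concrete, the following minimal sketch computes horizontal and vertical projection profiles of a small binary region. It is only an illustration under stated assumptions: the function name `projection_profiles`, the list-of-rows image encoding (1 = ink, 0 = background), and the toy data are hypothetical and not taken from the paper. The hidden semi-Markov modeling of these profiles is not shown.

```python
def projection_profiles(image):
    """Return (horizontal, vertical) projection profiles of a binary image.

    Assumed encoding (not from the paper): `image` is a list of
    equal-length rows, with 1 for an ink pixel and 0 for background.
    horizontal[i] is the ink count in row i;
    vertical[j] is the ink count in column j.
    """
    horizontal = [sum(row) for row in image]
    vertical = [sum(col) for col in zip(*image)]
    return horizontal, vertical


# Toy region: two "text lines" separated by one blank row.
region = [
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
horizontal, vertical = projection_profiles(region)
```

In this toy region the blank row shows up as a zero in the horizontal profile, which is the kind of cue a sequence model over the profile could use to separate text lines from gaps.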