Logical structure detection for heterogeneous document classes

Authors: Leon Todoran, Marco Aiello, Christof Monz, Marcel Worring

DOI: 10.1117/12.410827

Abstract: We present a fully implemented system based on generic document knowledge for detecting the logical structure of documents for which only general layout information is assumed. In particular, we focus on detecting the reading order. Our system integrates components based on computer vision, artificial intelligence, and natural language processing techniques. The prominent feature of our framework is its ability to handle documents from heterogeneous collections. The system has been evaluated on a standard collection of documents to measure the quality of the reading order detection. Experimental results for each component and for the system as a whole are presented and discussed in detail. The performance of the system is promising, especially when considering the diversity of the document collection.