Authors: Laurent Denoue, Francine Chen, Patrick Chiu
DOI:
Keywords:
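Abstract: A system and method to identify pictures in documents. An image representing a page of a document is received. The image is analyzed to identify text objects in the page. A masked image is generated by masking out regions of the image that include the text objects. Groups of pixels in the masked image are identified, wherein a respective group of pixels corresponds to at least one picture in the page. When there are one or more groups of pixels, for a respective group of pixels, a region of the image for a respective picture is identified based on the group of pixels. Metadata tags are stored, wherein a respective metadata tag includes information about a bounding box of a respective picture.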
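
The pipeline described in the abstract (mask out text, group the remaining pixels, record a bounding box per group) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' method: the page is a small binary grid (1 = ink), the text boxes are taken as already known (for example from an OCR step), pixel groups are found with 4-connected component labeling, and the names `mask_text`, `find_picture_boxes`, `tag_pictures`, and the `min_pixels` threshold are hypothetical.

```python
from collections import deque


def mask_text(page, text_boxes):
    """Return a copy of the page with every text box zeroed out.

    Boxes are (top, left, bottom, right) with inclusive coordinates.
    """
    masked = [row[:] for row in page]
    for top, left, bottom, right in text_boxes:
        for r in range(top, bottom + 1):
            for c in range(left, right + 1):
                masked[r][c] = 0
    return masked


def find_picture_boxes(masked, min_pixels=4):
    """Group remaining ink pixels (4-connectivity) and return their bounding boxes.

    Groups smaller than min_pixels are treated as noise (an assumed heuristic).
    """
    rows, cols = len(masked), len(masked[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if masked[r0][c0] == 0 or seen[r0][c0]:
                continue
            # Breadth-first flood fill over one group of pixels.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            top, left, bottom, right = r0, c0, r0, c0
            count = 0
            while queue:
                r, c = queue.popleft()
                count += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and masked[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if count >= min_pixels:
                boxes.append((top, left, bottom, right))
    return boxes


def tag_pictures(page, text_boxes, min_pixels=4):
    """Build one metadata tag per detected picture, holding its bounding box."""
    masked = mask_text(page, text_boxes)
    return [{"type": "picture", "bbox": box}
            for box in find_picture_boxes(masked, min_pixels)]


# Toy page: a text line on row 0, a 2x3 "picture", and one stray noise pixel.
page = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
text_boxes = [(0, 0, 0, 3)]
tags = tag_pictures(page, text_boxes)
print(tags)  # [{'type': 'picture', 'bbox': (2, 2, 3, 4)}]
```

On the toy page, the text line is masked away, the stray pixel falls below the size threshold, and only the 2x3 block yields a tag. A real page image would need a text detector and likely some smoothing or merging of nearby groups, neither of which is shown here.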