Authors: Mengli Cheng, Minghui Qiu, Xing Shi, Jun Huang, Wei Lin
Keywords: One shot, Text detection, Field (computer science), Computer science, Information retrieval, Task (project management), Conditional random field, Information extraction, Belief propagation, Structure (mathematical logic)
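Abstract: Structured information extraction from document images usually consists of three steps: text detection, text recognition, and text field labeling. While text detection and text recognition have been heavily studied and improved a lot in the literature, text field labeling is less explored and still faces many challenges. Existing learning-based methods for the text field labeling task usually require a large amount of labeled examples to train a specific model for each type of document. However, collecting large amounts of document images and labeling them is difficult and sometimes impossible due to privacy issues. Deploying separate models for each type of document also consumes a lot of resources. Facing these challenges, we explore one-shot learning for the text field labeling task. Existing one-shot learning methods for the task are mostly rule-based and have difficulty in labeling fields in crowded regions with few landmarks and fields consisting of multiple separate text regions. To alleviate these problems, we proposed a novel deep end-to-end trainable approach for one-shot text field labeling, which makes use of an attention mechanism to transfer the layout information between document images. We further applied a conditional random field on the transferred layout information for the refinement of field labeling. We collected and annotated a real-world one-shot field labeling dataset with a large variety of document types and conducted extensive experiments to examine the effectiveness of the proposed model. To stimulate research in this direction, the collected dataset and the one-shot model will be released (https://github.com/AlibabaPAI/one_shot_text_labeling).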