A Neural Approach for Text Extraction from Scholarly Figures

Authors: David Morris, Peichen Tang, Ralph Ewerth

DOI: 10.1109/ICDAR.2019.00231

Keywords: Computer science, Information retrieval, Line (text file), Container (abstract data type), Convolutional neural network, Artificial neural network, Pipeline (software)

Abstract: In recent years, the problem of scene text extraction from images has received extensive attention and significant progress. However, text extraction from scholarly figures such as plots and charts remains an open problem, in part due to the difficulty of locating irregularly placed text lines. To the best of our knowledge, the literature has not described the implementation of a text extraction system for scholarly figures that adapts deep convolutional neural networks used for scene text detection. In this paper, we propose a text extraction approach for scholarly figures that forgoes preprocessing in favor of using a deep convolutional neural network for text line localization. Our system uses a publicly available scene text detection approach whose network architecture is well suited to text extraction from scholarly figures. Training data are derived from arXiv papers, from which figures are extracted using the Allen Institute's pdffigures tool. Since this tool analyzes PDF data as a container format in order to extract text location through the mechanisms which render it, we were able to gather a large set of labeled training samples. We show a significant improvement over methods in the literature, and discuss the structural changes to the text extraction pipeline.

References (18)
Yingying Zhu, Cong Yao, Xiang Bai. Scene text detection and recognition: recent advances and future trends. Frontiers of Computer Science, vol. 10, pp. 19-36 (2016). DOI: 10.1007/S11704-015-4488-0
Weihua Huang, Chew Lim Tan. A system for understanding imaged infographics and its applications. Document Engineering, pp. 9-18 (2007). DOI: 10.1145/1284420.1284427
Mohammad Reza Yousefi, Mohammad Reza Soheili, Thomas M. Breuel, Ehsanollah Kabir, Didier Stricker. Binarization-free OCR for historical documents using LSTM networks. International Conference on Document Analysis and Recognition, pp. 1121-1125 (2015). DOI: 10.1109/ICDAR.2015.7333935
Songhua Xu, Michael Krauthammer. A new pivoting and iterative text detection algorithm for biomedical images. Journal of Biomedical Informatics, vol. 43, pp. 924-931 (2010). DOI: 10.1016/J.JBI.2010.09.006
Baoguang Shi, Xiang Bai, Cong Yao. An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, pp. 2298-2304 (2017). DOI: 10.1109/TPAMI.2016.2646371
Jerzy Sas, Andrzej Zolnierek. Three-Stage Method of Text Region Extraction from Diagram Raster Images. Computer Recognition Systems, pp. 527-538 (2013). DOI: 10.1007/978-3-319-00969-8_52
K. C. Santosh, Aafaque Aafaque, Sameer Antani, George R. Thoma. Line Segment-Based Stitched Multipanel Figure Separation for Effective Biomedical CBIR. International Journal of Pattern Recognition and Artificial Intelligence, vol. 31, pp. 1757003- (2017). DOI: 10.1142/S0218001417570038
Muhammad Zeshan Afzal, Andreas Kolsch, Sheraz Ahmed, Marcus Liwicki. Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification. International Conference on Document Analysis and Recognition, pp. 883-888 (2017). DOI: 10.1109/ICDAR.2017.149
Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, Weiran He, Jiajun Liang. EAST: An Efficient and Accurate Scene Text Detector. Computer Vision and Pattern Recognition, vol. 2017, pp. 2642-2651 (2017). DOI: 10.1109/CVPR.2017.283
Dominik Moritz. Text detection in screen images with a Convolutional Neural Network. The Journal of Open Source Software, vol. 2, pp. 235- (2017). DOI: 10.21105/JOSS.00235