Survey and empirical comparison of different approaches for text extraction from scholarly figures

Authors: Falk Böschen, Tilman Beck, Ansgar Scherp

DOI: 10.1007/S11042-018-6162-7

Abstract: Different approaches have been proposed in the past to address the challenge of extracting text from scholarly figures. However, until recently, no comparative evaluation of the different approaches had been conducted. Thus, we performed an extensive study of the related work and evaluated in total 32 different approaches. In this work, we perform a more detailed comparison of the 7 most relevant approaches described in the literature and extend to 37 systematic linear combinations of methods for extracting text from scholarly figures. Our generic pipeline, consisting of six steps, allows us to freely combine the different possible methods and perform a fair comparison. Overall, we have evaluated 44 different linear pipeline configurations and systematically compared the different methods. We then derived two non-linear configurations and a two-pass approach. We evaluate all pipeline configurations over four datasets of scholarly figures of different origin and characteristics. The quality of the extraction results is assessed using F-measure and Levenshtein distance, and we measure the runtime performance. Our experiments showed that there is a linear configuration that overall shows the best text extraction quality on all datasets. Further experiments showed that the best configuration can be improved by extending it to a two-pass approach. Regarding the runtime, we observed huge differences, from very fast approaches to those running for several weeks. Our experiments found the best working configuration from our method set. However, they also showed that further improvements regarding region classification are needed.
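The two quality measures named in the abstract can be sketched in a few lines of Python. This is only an illustration of the standard definitions, not the authors' evaluation code: the function names are hypothetical, and how the paper matches extracted text to ground truth before counting true and false positives is not stated in this record.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def f_measure(true_positives: int, false_positives: int,
              false_negatives: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.
    Returns 0.0 when there is nothing to score (assumed convention)."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator
```

For example, `levenshtein("kitten", "sitting")` counts the edits between an extracted string and its ground truth, while `f_measure(8, 2, 2)` summarizes how many text elements were correctly found.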
