A Novel Joint Character Categorization and Localization Approach for Character-Level Scene Text Recognition

Authors: Xianbiao Qi, Yihao Chen, Rong Xiao, Chun-Guang Li, Qin Zou

DOI: 10.1109/ICDARW.2019.40086

Abstract: Scene text recognition has become an active research area in pattern recognition in recent years. Currently, the mainstream approach is an image-based sequence recognition model. However, such a model usually cannot yield accurate character-level category and location information. To address this deficiency, in this paper we propose a novel character-level scene text recognition framework for simultaneously categorizing and localizing characters. Moreover, we present an effective joint learning strategy that helps the model learn from both character-level and word-level annotations. Extensive experiments on five benchmark data sets (IIIT-5K, SVT, ICDAR03, ICDAR13, and ICDAR15) show promising results. In particular, we confirm that our proposal is more robust to length variation and non-language text.

References (38)
Max Jaderberg, Andrea Vedaldi, Andrew Zisserman. Deep Features for Text Spotting. European Conference on Computer Vision, pp. 512-528 (2014). DOI: 10.1007/978-3-319-10593-2_34
Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman. Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition. arXiv: Computer Vision and Pattern Recognition (2014).
Tao Wang, David J. Wu, Adam Coates, Andrew Y. Ng. End-to-end text recognition with convolutional neural networks. International Conference on Pattern Recognition, pp. 3304-3308 (2012).
Albert Gordo. Supervised mid-level features for word image representation. Computer Vision and Pattern Recognition, pp. 2956-2964 (2015). DOI: 10.1109/CVPR.2015.7298914
Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman. Reading Text in the Wild with Convolutional Neural Networks. International Journal of Computer Vision, vol. 116, pp. 1-20 (2016). DOI: 10.1007/S11263-015-0823-Z
Cong Yao, Xiang Bai, Baoguang Shi, Wenyu Liu. Strokelets: A Learned Multi-scale Representation for Scene Text Recognition. Computer Vision and Pattern Recognition, pp. 4042-4049 (2014). DOI: 10.1109/CVPR.2014.515
Jose A. Rodriguez-Serrano, Albert Gordo, Florent Perronnin. Label Embedding: A Frugal Baseline for Text Recognition. International Journal of Computer Vision, vol. 113, pp. 193-207 (2015). DOI: 10.1007/S11263-014-0793-6
Kai Wang, Boris Babenko, Serge Belongie. End-to-end scene text recognition. International Conference on Computer Vision, pp. 1457-1464 (2011). DOI: 10.1109/ICCV.2011.6126402
Dimosthenis Karatzas, Faisal Shafait, Seiichi Uchida, Masakazu Iwamura, Lluis Gomez i Bigorda, Sergi Robles Mestre, Joan Mas, David Fernandez Mota, Jon Almazan Almazan, Lluis Pere de las Heras. ICDAR 2013 Robust Reading Competition. International Conference on Document Analysis and Recognition, pp. 1484-1493 (2013). DOI: 10.1109/ICDAR.2013.221
Chen-Yu Lee, Anurag Bhardwaj, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu. Region-Based Discriminative Feature Pooling for Scene Text Recognition. Computer Vision and Pattern Recognition, pp. 4050-4057 (2014). DOI: 10.1109/CVPR.2014.516