Deep Features for Text Spotting

Authors: Max Jaderberg, Andrea Vedaldi, Andrew Zisserman

DOI: 10.1007/978-3-319-10593-2_34

Abstract: The goal of this work is text spotting in natural images. This is divided into two sequential tasks: detecting word regions in the image, and recognizing the words within these regions. We make the following contributions: first, we develop a Convolutional Neural Network (CNN) classifier that can be used for both tasks. The CNN has a novel architecture that enables efficient feature sharing (by using a number of layers in common) for text detection, character case-sensitive and insensitive classification, and bigram classification. It exceeds the state-of-the-art performance for all of these. Second, we make a number of technical changes over the traditional CNN architectures, including no downsampling for a per-pixel sliding window, and multi-mode learning with a mixture of linear models (maxout). Third, we have a method of automated data mining of Flickr that generates word and character level annotations. Finally, these components are used together to form an end-to-end, state-of-the-art text spotting system. We evaluate the text-spotting system on two standard benchmarks, the ICDAR Robust Reading data set and the Street View Text data set, and demonstrate improvements over the state-of-the-art on multiple measures.
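The maxout unit mentioned in the abstract takes the maximum over a group of linear responses. The following is a minimal NumPy sketch of that idea only, assuming groups of k consecutive channels; the function name, the grouping scheme, and the toy values are illustrative and are not taken from the paper.

```python
import numpy as np


def maxout(z, k):
    """Maxout over groups of k consecutive linear responses.

    z: array of shape (..., c), where c is divisible by k.
    Returns an array of shape (..., c // k).
    """
    z = np.asarray(z)
    if z.shape[-1] % k != 0:
        raise ValueError("last dimension must be divisible by k")
    # Split the channel axis into (c // k) groups of k responses each,
    # then keep only the largest response in every group.
    grouped = z.reshape(*z.shape[:-1], z.shape[-1] // k, k)
    return grouped.max(axis=-1)


# Toy example (hypothetical values): 2 inputs, 4 linear responses each,
# pooled in groups of 2.
responses = np.array([[1.0, -2.0, 0.5, 3.0],
                      [-1.0, -0.5, 2.0, 2.0]])
activated = maxout(responses, k=2)
print(activated)
```

Because each output is the maximum of several linear functions of the input, a unit of this kind can represent a piecewise-linear activation rather than a fixed nonlinearity.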
