Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

Authors: Julian Ibarz, Ian Goodfellow, Sacha Arnoud, Vinay Shet, Yaroslav Bulatov


Keywords: Artificial neural network, Segmentation, Convolutional neural network, Image (mathematics), Artificial intelligence, Computer science, Domain (software engineering), Pattern recognition

Abstract: Recognizing arbitrary multi-character text in unconstrained natural photographs is a hard problem. In this paper, we address an equally hard sub-problem in this domain, viz. recognizing arbitrary multi-digit numbers from Street View imagery. Traditional approaches to solve this problem typically separate out the localization, segmentation, and recognition steps. In this paper we propose a unified approach that integrates these three steps via the use of a deep convolutional neural network that operates directly on the image pixels. We employ the DistBelief implementation of deep neural networks in order to train large, distributed neural networks on high quality images. We find that the performance of this approach increases with the depth of the convolutional network, with the best performance occurring in the deepest architecture we trained, with eleven hidden layers. We evaluate this approach on the publicly available SVHN dataset and achieve over $96\%$ accuracy in recognizing complete street numbers. We show that on a per-digit recognition task, we improve upon the state-of-the-art, achieving $97.84\%$ accuracy. We also evaluate this approach on an even more challenging dataset generated from Street View imagery containing several tens of millions of street number annotations and achieve over $90\%$ accuracy. To further explore the applicability of the proposed system to broader text recognition tasks, we apply it to synthetic distorted text from reCAPTCHA. reCAPTCHA is one of the most secure reverse Turing tests that uses distorted text to distinguish humans from bots. We report a $99.8\%$ accuracy on the hardest category of reCAPTCHA. Our evaluations on both tasks indicate that at specific operating thresholds, the performance of the proposed system is comparable to, and in some cases exceeds, that of human operators.
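The abstract reports three kinds of figures: accuracy on complete street numbers, accuracy on a per-digit task, and performance at specific operating thresholds. The following is a minimal sketch of how such metrics could be computed, not the authors' evaluation code. All function names are hypothetical, the position-by-position digit comparison is an assumption about how per-digit accuracy might be scored on transcribed sequences, and the confidence scores are assumed to come from some model that emits one score per transcription.

```python
from typing import List, Tuple


def sequence_accuracy(preds: List[str], labels: List[str]) -> float:
    """Fraction of numbers whose full transcription matches the label."""
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


def per_digit_accuracy(preds: List[str], labels: List[str]) -> float:
    """Fraction of label digits matched position by position.

    Assumption: digits are compared at the same index; label digits with
    no predicted counterpart count as errors.
    """
    total = sum(len(t) for t in labels)
    correct = sum(
        pc == tc
        for p, t in zip(preds, labels)
        for pc, tc in zip(p, t)
    )
    return correct / total


def coverage_at_threshold(
    preds: List[str],
    labels: List[str],
    confidences: List[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) over examples with confidence >= threshold.

    Coverage is the share of inputs the system is confident enough to
    transcribe; accuracy is measured only on that retained share.
    """
    kept = [(p, t) for p, t, c in zip(preds, labels, confidences) if c >= threshold]
    coverage = len(kept) / len(labels)
    accuracy = sum(p == t for p, t in kept) / len(kept) if kept else 0.0
    return coverage, accuracy


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    preds = ["123", "45", "678", "90"]
    labels = ["123", "46", "678", "91"]
    confidences = [0.99, 0.60, 0.95, 0.40]
    print(sequence_accuracy(preds, labels))
    print(per_digit_accuracy(preds, labels))
    print(coverage_at_threshold(preds, labels, confidences, 0.9))
```

Raising the threshold trades coverage for accuracy, which is one way to read the abstract's statement that the system matches human operators "at specific operating thresholds".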

