Using Convolutional Encoder-Decoder for Document Image Binarization

Authors: Xujun Peng, Huaigu Cao, Prem Natarajan

DOI: 10.1109/ICDAR.2017.121

Keywords:

Abstract: Document image binarization is one of the critical initial steps for document analysis and understanding. Previous work mostly focused on exploiting hand-crafted features to build statistical models for distinguishing text from background. However, these approaches achieved only limited success because (a) the effectiveness of hand-crafted features is bounded by the researcher's domain knowledge and understanding of the documents, and (b) a universal model cannot always capture the complexity of different document degradations. To address these challenges, we propose a convolutional encoder-decoder model with deep learning in this paper. In the proposed method, mid-level document image representations are learnt by a stack of convolutional layers, which compose the encoder of the architecture. The binarized image is then obtained by mapping the low-resolution representations to the original size through the decoder, which is composed of a series of transposed convolutional layers. We compare the proposed method with other binarization algorithms both qualitatively and quantitatively on a public dataset. The experimental results show that the proposed method has comparable performance to other binarization approaches and generalizes better with limited in-domain training data.
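The size bookkeeping behind the encoder-decoder idea can be sketched with a few lines of arithmetic. This is a minimal illustration, not the authors' configuration: the depth, kernel size, stride, and padding below are assumed values chosen for the example, and the helper names are hypothetical. The two formulas are the standard output-size relations for a convolution and a transposed convolution with dilation 1.

```python
def conv_out(n, kernel, stride=1, padding=0):
    """Output length along one axis of a convolution (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


def transposed_conv_out(n, kernel, stride=1, padding=0, output_padding=0):
    """Output length along one axis of a transposed convolution (dilation 1)."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding


def trace_sizes(n, depth=3, kernel=3, stride=2, padding=1):
    """Follow one spatial dimension through `depth` encoder layers,
    then `depth` decoder layers. All hyperparameters are assumptions
    made for this sketch, not values taken from the paper."""
    sizes = [n]
    # Encoder: each strided convolution roughly halves the resolution.
    for _ in range(depth):
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    # Decoder: each transposed convolution roughly doubles it again.
    for _ in range(depth):
        sizes.append(
            transposed_conv_out(
                sizes[-1], kernel, stride, padding, output_padding=stride - 1
            )
        )
    return sizes


if __name__ == "__main__":
    print(trace_sizes(256))
```

With these assumed settings, a 256-pixel side shrinks to 32 in the encoder and is mapped back to 256 by the decoder. A side length that is not divisible by 2 raised to the depth (for example 250) comes back at a different size, so in practice the input would need padding or the output cropping.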

References (34)
Hossein Ziaei Nafchi, Reza Farrahi Moghaddam, Mohamed Cheriet. Historical document binarization based on phase information of images. International Conference on Computer Vision, pp. 1-12 (2012). DOI: 10.1007/978-3-642-37484-5_1
J. Pastor-Pellicer, S. España-Boquera, F. Zamora-Martínez, M. Zeshan Afzal, Maria Jose Castro-Bleda. Insights on the Use of Convolutional Neural Networks for Document Image Binarization. International Work-Conference on Artificial and Natural Neural Networks, pp. 115-126 (2015). DOI: 10.1007/978-3-319-19222-2_10
J. Sauvola, T. Seppanen, S. Haapakoski, M. Pietikainen. Adaptive document binarization. International Conference on Document Analysis and Recognition, vol. 1, pp. 147-152 (1997). DOI: 10.1109/ICDAR.1997.619831
Hyeonwoo Noh, Seunghoon Hong, Bohyung Han. Learning Deconvolution Network for Semantic Segmentation. International Conference on Computer Vision, pp. 1520-1528 (2015). DOI: 10.1109/ICCV.2015.178
Abdelâali Hassaïne, Somaya Al-Maadeed, Ahmed Bouridane. A Set of Geometrical Features for Writer Identification. Neural Information Processing, pp. 584-591 (2012). DOI: 10.1007/978-3-642-34500-5_69
Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully convolutional networks for semantic segmentation. Computer Vision and Pattern Recognition, pp. 3431-3440 (2015). DOI: 10.1109/CVPR.2015.7298965
Xujun Peng, Srirangaraj Setlur, Venu Govindaraju, Ramachandrula Sitaram. Binarization of camera-captured document using A MAP approach. Document Recognition and Retrieval, vol. 7874 (2011). DOI: 10.1117/12.874091
Thibault Lelore, Frederic Bouchara. Super-Resolved Binarization of Text Based on the FAIR Algorithm. International Conference on Document Analysis and Recognition, pp. 839-843 (2011). DOI: 10.1109/ICDAR.2011.172
Amjad Rehman, Tanzila Saba. Neural networks for document image preprocessing: state of the art. Artificial Intelligence Review, vol. 42, pp. 253-273 (2014). DOI: 10.1007/S10462-012-9337-Z