On the Modification of Binarization Algorithms to Retain Grayscale Information for Handwritten Text Recognition

Authors: Mauricio Villegas, Verónica Romero, Joan Andreu Sánchez

DOI: 10.1007/978-3-319-19390-8_24

Keywords: Speech recognition, Digital library, Search engine indexing, Information retrieval, Intelligent word recognition, Grayscale, Text recognition, Computer science

Abstract: The amount of digitized legacy documents has been rising over the last years, due mainly to the increasing number of on-line digital libraries publishing this kind of document. The vast majority of them remain waiting to be transcribed, to provide historians and other researchers with new ways of indexing, consulting and querying them. However, the performance and accuracy of state-of-the-art Handwritten Text Recognition techniques decrease dramatically when they are applied to these historical documents. This is due to typical paper degradation problems. Therefore, robust pre-processing is an important step for helping further recognition steps. This paper proposes to take existing binarization techniques, in order to retain their advantages, and modify them in such a way that some of the original grayscale information is preserved and considered by the subsequent recognizer. Results are reported with the publicly available ESPOSALLES database.
