Authors: Ladislav Lenc, Jiří Martínek, Pavel Král
DOI: 10.1007/978-3-030-19823-7_29
Keywords:
Abstract: This work aims at data preparation for OCR systems based on recurrent neural networks. Precisely annotated data are necessary for training a network as well as for the evaluation of methods. It is possible to synthesize the data; however, such data are not as realistic as real ones. Manual annotation is thus still needed in many cases, especially in the case of historical documents, which we are focusing on. Although there are several tools for complex document processing, to the best of our knowledge, a simple tool for this task is completely missing. Therefore, we propose and implement a set of tools utilizing artificial intelligence to simplify the annotation process. These tools create ground truths for the line images used in current OCR systems. Another contribution of this paper is making these tools freely available for research purposes.