Attention-based Fully Gated CNN-BGRU for Russian Handwritten Text

Authors: Abdelrahman Abdallah, Mohamed Hamada, Daniyar Nurseitov

DOI: 10.3390/JIMAGING6120141

Keywords: Character (computing), Feature (machine learning), Sequence, Kazakh, Task (computing), Word error rate, Speech recognition, Artificial neural network, Computer science

Abstract: This research approaches the task of handwritten text recognition with attention encoder-decoder networks trained on the Kazakh and Russian languages. We developed a novel deep neural network model based on a Fully Gated CNN, supported by multiple bidirectional GRU layers and attention mechanisms to manipulate sophisticated features. The model achieves a Character Error Rate (CER) of 0.045, a Word Error Rate (WER) of 0.192, and a Sequence Error Rate (SER) of 0.253 on the first test dataset, and 0.064 CER, 0.24 WER, and 0.361 SER on the second test dataset. We also propose fully gated layers that take advantage of multiplying the output feature from tanh with the input feature, which achieves better results. We experimented with our model on the Handwritten Kazakh & Russian Database (HKR). Ours is the first work on the HKR dataset, and it demonstrates state-of-the-art results compared with most other existing models.
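
The abstract reports three error rates without defining them. The sketch below shows one common way such metrics are computed. It is an illustration under stated assumptions, not the authors' evaluation code: CER and WER are taken as Levenshtein edit distance divided by reference length (in characters and whitespace-separated words, respectively), and SER as the fraction of lines that are not exact matches. The function names `edit_distance` and `error_rates` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(references, hypotheses):
    """Return (CER, WER, SER) over paired reference/hypothesis lines.

    Assumes both lists are non-empty, equally long, and that every
    reference contains at least one character and one word.
    """
    char_edits = char_total = 0
    word_edits = word_total = 0
    seq_errors = 0
    for ref, hyp in zip(references, hypotheses):
        char_edits += edit_distance(ref, hyp)
        char_total += len(ref)
        ref_words, hyp_words = ref.split(), hyp.split()
        word_edits += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        seq_errors += ref != hyp  # any mismatch counts the whole line as wrong
    return (char_edits / char_total,
            word_edits / word_total,
            seq_errors / len(references))


refs = ["привет мир", "да"]
hyps = ["привет мир", "до"]
cer, wer, ser = error_rates(refs, hyps)
```

Here one wrong character in the second line yields a small CER but a much larger SER, since a single error marks the whole sequence as incorrect. This is consistent with the ordering CER < WER < SER in the reported results.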
