Method of identifying script of line of text

Author: Carson S. Cumbee

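Abstract: A method of identifying the script of a line of text by first assigning a weight to each n-gram in a group of documents of known scripts, where an n-gram is a sequence of numbers representing the k-means cluster centroids that character segments of the known scripts most closely match. A line of text is identified, where the line of text is made up of pixels. The identified line of text is cropped so that only a percentage of the pixels remain. The cropped line of text is vertically and horizontally rescaled into gray-scale pixels. The vertical segments of the rescaled line of text are each replaced with the number of the k-means centroid it most closely matches. The sequence of n-grams that represents the line of text is scored against the n-gram weights of the known scripts. The highest score is compared to the scores of the other scripts. The script of the line of text is determined to be the script of the documents that scored the highest.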
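
The scoring stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes the line image has already been cropped and rescaled into gray-scale columns and that the k-means centroids were computed beforehand. All function names are hypothetical, and the weighting scheme (relative n-gram frequency per script) is an assumption, since the abstract does not say how the weights are derived.

```python
from collections import Counter


def nearest_centroid(column, centroids):
    """Return the index of the centroid closest to a pixel column
    (squared Euclidean distance over gray-scale values)."""
    best_index, best_dist = 0, float("inf")
    for index, centroid in enumerate(centroids):
        dist = sum((a - b) ** 2 for a, b in zip(column, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def quantize_line(columns, centroids):
    """Replace each vertical segment with the number of its closest centroid."""
    return [nearest_centroid(column, centroids) for column in columns]


def ngrams(symbols, n):
    """Return all length-n windows of a centroid-number sequence as tuples."""
    return [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]


def train_weights(training_sequences, n):
    """Assign a weight to each n-gram per known script.

    ASSUMPTION: the weight is the n-gram's relative frequency in that
    script's training sequences; the source does not specify the formula.
    """
    weights = {}
    for script, sequences in training_sequences.items():
        counts = Counter(g for seq in sequences for g in ngrams(seq, n))
        total = sum(counts.values()) or 1  # avoid division by zero
        weights[script] = {g: c / total for g, c in counts.items()}
    return weights


def identify_script(symbols, weights, n):
    """Score the line against every script and return (best_script, scores).

    N-grams never seen for a script contribute zero to that script's score.
    """
    grams = ngrams(symbols, n)
    scores = {
        script: sum(table.get(g, 0.0) for g in grams)
        for script, table in weights.items()
    }
    return max(scores, key=scores.get), scores


# Toy data: columns are two pixels tall, and there are three made-up centroids.
centroids = [(0, 0), (255, 255), (0, 255)]
training = {
    "script_a": [[0, 1, 0, 1, 0, 1]],
    "script_b": [[2, 2, 1, 2, 2, 1]],
}
weights = train_weights(training, n=2)
line_columns = [(10, 5), (250, 240), (3, 8), (245, 255)]
symbols = quantize_line(line_columns, centroids)
best_script, scores = identify_script(symbols, weights, n=2)
print(best_script, scores)
```

Summing relative frequencies keeps the sketch short. A log-likelihood or discriminative weighting would fit the same structure by changing only `train_weights`.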
