Authors: Reinhold Huber-Mork, Alexander Schindler
DOI: 10.1109/ISPA.2013.6703735
Keywords:
Abstract: We describe a method for defect detection and classification for collections of digital images of historical book documents. Undistorted text from various books, characterized by strong variation of language, font, and layout properties, is discriminated from typical errors in digitization processes such as occlusion by an operator's hand, a visible book edge, or image warping artifacts. A bag of local features approach is compared to a global characterization of the location, size, and orientation of detected keypoints. Machine learning is used to discriminate between those classes. Results for the different features are compared on the task of discriminating undistorted text from the major distortion class, which is the presence of an operator's hand, where classification based on the derived histograms achieved a cross-validation accuracy better than 99 percent on a representative data set. Taking into account up to three classes of distortions still resulted in accuracies beyond 90 percent using visual histograms as classifier input.
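
The global characterization mentioned in the abstract (histograms over the location, size, and orientation of detected keypoints) can be illustrated with a minimal sketch. The abstract does not state the keypoint detector, the bin counts, or the normalization, so everything below is an assumption for illustration: the function names `histogram` and `global_keypoint_descriptor`, the `(x, y, size, angle_degrees)` tuple layout, and the `max_size` and `bins` parameters are hypothetical and not taken from the paper.

```python
def histogram(values, lo, hi, bins):
    """Normalized histogram of `values` over [lo, hi) using equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp values at or beyond the edges
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * bins


def global_keypoint_descriptor(keypoints, image_width, image_height,
                               max_size, bins=4):
    """Concatenate histograms of keypoint x, y, size, and orientation.

    `keypoints` is a list of (x, y, size, angle_degrees) tuples; this layout
    is an assumption made for the sketch, not the paper's data format.
    """
    xs = [kp[0] for kp in keypoints]
    ys = [kp[1] for kp in keypoints]
    sizes = [kp[2] for kp in keypoints]
    angles = [kp[3] % 360.0 for kp in keypoints]  # wrap angles into [0, 360)
    return (histogram(xs, 0.0, image_width, bins)
            + histogram(ys, 0.0, image_height, bins)
            + histogram(sizes, 0.0, max_size, bins)
            + histogram(angles, 0.0, 360.0, bins))


# Two hypothetical keypoints on a 100 x 100 page image.
descriptor = global_keypoint_descriptor(
    [(10, 10, 2, 0), (90, 90, 8, 270)],
    image_width=100, image_height=100, max_size=10, bins=2)
```

A fixed-length vector of this kind could then serve as input to any standard classifier evaluated by cross-validation; which classifier the authors used is not stated in the abstract.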