Author: Aadesh Neupane
DOI: 10.5120/18799-0315
Keywords: Computer science, Discrete cosine transform, Artificial intelligence, Consonant, Wavelet transform, Natural language processing, Vowel, Cluster analysis, Database, Nepali, Character (mathematics)
Abstract: Character recognition requires a dataset on which to apply recognition algorithms and from which to generate efficient models. For the Nepali language, no such character dataset exists for research, at least in the public domain. Nepali has 36 consonant characters and 12 vowels, and each vowel can modify the consonant characters. In this regard, there can be a total of 446 characters, including numeric characters, so creating a dataset manually requires considerable effort, cost, and time. This paper describes an elegant way of creating the dataset using a semi-supervised clustering approach that minimizes this effort. In addition, the existing segmentation algorithm [1] is optimized to segment both handwritten and scanned text. Complex features are extracted from the segmented characters by applying the Discrete Cosine Transform and the wavelet transform. The extracted features are then used to create the database using pHash and k-means clustering. At present, the database contains 38,493 characters distributed among 52 different clusters.
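The feature-and-cluster step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each segmented character is already a square grayscale array, it computes a DCT-based hash in the spirit of pHash (low-frequency coefficients thresholded at their median), it omits the wavelet features, and it uses a plain k-means with a deterministic farthest-first initialization chosen here only for reproducibility. The names `dct_matrix`, `dct_phash`, and `kmeans` are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]          # frequency index (rows)
    i = np.arange(n)[None, :]          # sample index (columns)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)         # DC row has its own scaling
    return m


def dct_phash(image, hash_size=8):
    """Hash a square grayscale glyph into hash_size**2 bits.

    Assumption: `image` is a 2-D float array such as a 32x32 glyph.
    """
    image = np.asarray(image, dtype=float)
    d = dct_matrix(image.shape[0])
    coeffs = d @ image @ d.T                   # separable 2-D DCT
    low = coeffs[:hash_size, :hash_size]       # keep low frequencies only
    median = np.median(low.flatten()[1:])      # ignore the DC term
    return (low > median).astype(np.uint8).flatten()


def kmeans(points, k, iters=20):
    """Plain k-means on row vectors; returns (labels, centers).

    Assumption: farthest-first initialization starting from the first
    point, used here so that the result is deterministic.
    """
    points = np.asarray(points, dtype=float)
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[int(d.argmax())])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):                   # leave empty clusters in place
                centers[j] = members.mean(axis=0)
    return labels, centers
```

In a semi-supervised workflow of the kind the abstract describes, the hashes of all segmented glyphs would be stacked into one array and passed to `kmeans`, after which a person labels each resulting cluster once rather than labeling every character individually.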