Authors: Shalini Puri, Satya Prakash Singh
DOI: 10.1007/978-981-13-7150-9_24
Keywords: Parsing, Support vector machine, Artificial intelligence, Sentence, Computer science, Natural language processing, Hindi, Keyword extraction, Classifier (UML), Machine translation, Automatic summarization
Abstract: In today’s world, many digitized Hindi text documents are generated daily on Government sites, news portals, and in the public and private sectors, and these need to be classified effectively into mutually exclusive pre-defined categories. Many text-based processing systems exist in the application domains of information retrieval, machine translation, summarization, simplification, keyword extraction, and other related parsing and linguistic perspectives, but there is still wide scope to classify extracted text into categories using a classifier. In this paper, a Text Classification model is proposed that accepts a set of known documents, preprocesses them at the document, sentence, and word levels, extracts features, trains an SVM classifier, and then classifies unknown documents. Such classification is challenging for Hindi because of its large number of available conjuncts and letter combinations, its sentence structure, and its multisense words. The experiments were performed on four documents of two categories, which were classified by the SVM with 100% accuracy.
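The pipeline outlined in the abstract (preprocess, extract features, train an SVM, classify unknown documents) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the bag-of-words features, the Pegasos-style linear SVM without a bias term, the function names, the four toy documents, and the "sports"/"health" categories are all hypothetical stand-ins, since the abstract does not specify them.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"ने", "में", "को", "की", "का", "के", "है", "और", "से"}

def preprocess(document):
    """Document -> sentences -> words, dropping stop words."""
    # Sentence level: split on the Devanagari danda and common end marks.
    sentences = [s for s in re.split(r"[।?!.]", document) if s.strip()]
    # Word level: whitespace tokenisation.
    words = [w for s in sentences for w in s.split()]
    return [w for w in words if w not in STOP_WORDS]

def vectorize(words, vocabulary):
    """L2-normalised bag-of-words counts over a fixed vocabulary."""
    counts = Counter(words)
    vec = [float(counts[term]) for term in vocabulary]
    norm = sum(v * v for v in vec) ** 0.5 or 1.0  # avoid division by zero
    return [v / norm for v in vec]

def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Pegasos-style subgradient descent on the regularised hinge loss.

    Labels must be +1 or -1. The bias term is omitted to keep the sketch short.
    """
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Shrink step from the L2 regulariser (factor equals 1 - 1/t).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move towards the correct side.
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w

def classify(document, w, vocabulary, labels=("health", "sports")):
    """Return labels[1] for a positive score, labels[0] otherwise."""
    x = vectorize(preprocess(document), vocabulary)
    score = sum(wj * xj for wj, xj in zip(w, x))
    return labels[1] if score > 0 else labels[0]

# Toy training set: four made-up documents in two made-up categories.
train_docs = [
    "भारत ने क्रिकेट मैच जीता। टीम ने अच्छा खेल दिखाया।",   # sports (+1)
    "खिलाड़ी ने क्रिकेट मैच में शतक बनाया।",                 # sports (+1)
    "डॉक्टर ने मरीज को दवा दी। अस्पताल में इलाज हुआ।",      # health (-1)
    "मरीज को अस्पताल में बुखार की दवा मिली।",               # health (-1)
]
train_labels = [1, 1, -1, -1]

vocabulary = sorted({w for d in train_docs for w in preprocess(d)})
X = [vectorize(preprocess(d), vocabulary) for d in train_docs]
weights = train_linear_svm(X, train_labels)

# Classify a previously unseen document.
print(classify("टीम ने क्रिकेट मैच खेला।", weights, vocabulary))
```

A production system would more likely use an established SVM library and a richer Hindi preprocessing step (normalisation, stemming, handling of conjuncts); the hand-rolled trainer here only keeps the sketch dependency-free.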