An Efficient Hindi Text Classification Model Using SVM

Authors: Shalini Puri, Satya Prakash Singh

DOI: 10.1007/978-981-13-7150-9_24

Keywords: Parsing, Support vector machine, Artificial intelligence, Sentence, Computer science, Natural language processing, Hindi, Keyword extraction, Classifier (UML), Machine translation, Automatic summarization

Abstract: In today’s world, many digitized Hindi text documents are generated daily at Government sites, news portals, and in the public and private sectors, and these need to be classified effectively into mutually exclusive pre-defined categories. Many Hindi text-processing systems already exist in the application domains of information retrieval, machine translation, text summarization, simplification, keyword extraction, and other related parsing and linguistic perspectives, but there is still wide scope to classify the extracted text into categories using a classifier. In this paper, a Hindi Text Classification model is proposed, which accepts a set of known Hindi documents, preprocesses them at the document, sentence and word levels, extracts features, and trains an SVM classifier, which further classifies a set of unknown Hindi documents. Such text classification becomes challenging in Hindi due to its large set of available conjuncts and letter combinations, its sentence structure, and its multisense words. The experiments have been performed on a set of four Hindi documents of two categories, which have been classified by the SVM with 100% accuracy.
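
The pipeline described in the abstract (preprocess at document, sentence and word levels, extract features, train an SVM, classify unknown documents) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy corpus, the category names `sports` and `politics`, the bag-of-words features, and the small pure-Python linear SVM trained by subgradient descent on the hinge loss, which stands in for whatever SVM implementation the authors used.

```python
import re
import unicodedata

# Hypothetical toy corpus of (document, category) pairs; not the paper's data.
TRAIN = [
    ("भारत ने क्रिकेट मैच जीता। खिलाड़ी ने शतक बनाया।", "sports"),
    ("टीम ने फुटबॉल मैच खेला। खिलाड़ी ने गोल किया।", "sports"),
    ("सरकार ने नया कानून पारित किया। संसद में बहस हुई।", "politics"),
    ("चुनाव में मंत्री ने भाषण दिया। सरकार ने घोषणा की।", "politics"),
]
SIGN = {"sports": 1.0, "politics": -1.0}  # assumed binary label encoding


def preprocess(document):
    """Document level: normalize Unicode. Sentence level: split on the danda.
    Word level: split each sentence on whitespace."""
    text = unicodedata.normalize("NFC", document)
    sentences = [s for s in re.split(r"[।?!]+", text) if s.strip()]
    return [word for s in sentences for word in s.split()]


def build_vocab(token_lists):
    """Assign each distinct training word a feature index."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(tokens, vocab):
    """Bag-of-words term counts; words outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for token in tokens:
        if token in vocab:
            vec[vocab[token]] += 1.0
    return vec


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on (lam/2)*||w||^2 + hinge loss, one example at a time."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # example violates the margin: hinge term is active
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:           # only the regularizer contributes
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def classify(document, vocab, w, b):
    """Label an unseen document by the sign of the linear decision function."""
    x = vectorize(preprocess(document), vocab)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "sports" if score >= 0 else "politics"


token_lists = [preprocess(doc) for doc, _ in TRAIN]
vocab = build_vocab(token_lists)
X = [vectorize(tokens, vocab) for tokens in token_lists]
y = [SIGN[label] for _, label in TRAIN]
w, b = train_linear_svm(X, y)

unknown = ["खिलाड़ी ने मैच जीता।", "संसद में सरकार ने बहस की।"]
predictions = [classify(doc, vocab, w, b) for doc in unknown]
print(predictions)
```

With only four training documents the two classes are trivially separable, so a perfect score on such a set says little about how the approach generalizes; a realistic setup would use a larger corpus and a held-out test split.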
