Named-Entity Recognition in Bengali

Authors: Apurbalal Senapati, Arjun Das, Utpal Garain

DOI: 10.1145/2701336.2701647

Abstract: This paper describes two systems for Named Entity Recognition (NER), and their performance has been compared. The first system is rule-based, whereas the second is statistical (based on CRF) in nature. The systems vary in some other aspects too: for example, the first system works on untagged data (not even POS tagging is done) to identify named entities, whereas the second system makes use of a POS tagger and a chunker. The rules used by the first system are mined from training data. The CRF-based classification does not require any explicit linguistic rules, but it uses a gazetteer built from Wiki sources.
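
To make the CRF-based setup concrete, the following is a minimal sketch of how per-token features (word form, POS tag, gazetteer membership, and neighbouring context) might be assembled before being handed to a CRF toolkit. It is not the authors' implementation: the feature set, the toy romanized gazetteer, and the names `GAZETTEER`, `token_features`, and `sentence_features` are all assumptions made for illustration. Chunk tags from a chunker could be added in the same way as the POS tags.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteer (romanized for readability). The paper's
# gazetteer is built from Wiki sources; these entries are illustrative only.
GAZETTEER: Set[str] = {"kolkata", "dhaka", "rabindranath"}


def token_features(tokens: List[str], pos_tags: List[str], i: int) -> Dict[str, object]:
    """Build a feature dictionary for the token at position i.

    The feature choice here is an assumption, not the paper's feature set.
    """
    word = tokens[i]
    feats: Dict[str, object] = {
        "word": word.lower(),
        "pos": pos_tags[i],
        "suffix3": word[-3:].lower(),
        "in_gazetteer": word.lower() in GAZETTEER,
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
    }
    # Context features from the previous and next tokens, when they exist.
    if i > 0:
        feats["prev_word"] = tokens[i - 1].lower()
        feats["prev_pos"] = pos_tags[i - 1]
    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1].lower()
        feats["next_pos"] = pos_tags[i + 1]
    return feats


def sentence_features(tokens: List[str], pos_tags: List[str]) -> List[Dict[str, object]]:
    """Return one feature dictionary per token in the sentence."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    return [token_features(tokens, pos_tags, i) for i in range(len(tokens))]


if __name__ == "__main__":
    sent = ["Rabindranath", "lived", "in", "Kolkata"]
    tags = ["NNP", "VBD", "IN", "NNP"]
    for feats in sentence_features(sent, tags):
        print(feats)
```

A list of such dictionaries per sentence, paired with a list of entity labels, is the input shape that common CRF toolkits accept. The gazetteer flag is what lets the model benefit from external name lists without hand-written linguistic rules.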
