Authors: Smruthi Mukund, Rohini Srihari, Debanjan Ghosh
DOI:
Keywords: Parsing, Natural language processing, Context (language use), Sequence, Urdu, Sentiment analysis, Computer science, Artificial intelligence, Noun, Word (computer architecture), Grammatical category
Abstract: Automatic extraction of opinion holders and targets (together referred to as opinion entities) is an important subtask of sentiment analysis. In this work, we attempt to accurately extract opinion entities from Urdu newswire. Due to the lack of resources required for training role labelers and dependency parsers (as in English) for Urdu, a more robust approach is presented, based on (i) generating candidate word sequences corresponding to opinion entities, and (ii) subsequently disambiguating these sequences as opinion holders or targets. Detecting the boundaries of such candidate sequences in Urdu is very different than in English, since in Urdu grammatical categories such as tense, gender, and case are captured in word inflections. We exploit the morphological inflections associated with nouns and verbs to correctly identify sequence boundaries. Different levels of information that capture context are encoded to train standard linear and sequence kernels. To this end, the best performance obtained for opinion entity detection for Urdu sentiment analysis is 58.06% F-Score using sequence kernels and 61.55% F-Score using a combination of sequence and linear kernels.