Using Sequence Kernels to identify Opinion Entities in Urdu

Authors: Smruthi Mukund, Rohini Srihari, Debanjan Ghosh

Keywords: Parsing, Natural language processing, Context (language use), Sequence, Urdu, Sentiment analysis, Computer science, Artificial intelligence, Noun, Word (computer architecture), Grammatical category

Abstract: Automatic extraction of opinion holders and targets (together referred to as opinion entities) is an important subtask of sentiment analysis. In this work, we attempt to accurately extract opinion entities from Urdu newswire. Due to the lack of resources required for training role labelers and dependency parsers (as in English) for Urdu, a more robust approach is presented, based on (i) generating candidate word sequences corresponding to opinion entities, and (ii) subsequently disambiguating these sequences as opinion holders or targets. Detecting the boundaries of such candidate sequences in Urdu is very different than in English, since in Urdu grammatical categories such as tense, gender and case are captured in word inflections. We exploit the morphological inflections associated with nouns and verbs to correctly identify sequence boundaries. Different levels of information that capture context are encoded to train standard linear and sequence kernels. To this end, the best performance obtained for opinion entity detection for Urdu sentiment analysis is 58.06% F-Score using sequence kernels and 61.55% F-Score using a combination of sequence and linear kernels.
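To make the idea of a sequence kernel concrete, the following is a minimal sketch of a gap-weighted subsequence kernel over token sequences, plus a weighted sum with a linear kernel. It is not the authors' implementation: the function names, the decay value, the subsequence length, the mixing weight, and the tag sequences in the usage note are all illustrative assumptions, and the brute-force enumeration is only practical for short candidate sequences.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def subsequence_features(tokens, n, decay):
    """Map each length-n (possibly gapped) subsequence to its decayed weight."""
    feats = defaultdict(float)
    for idx in combinations(range(len(tokens)), n):
        span = idx[-1] - idx[0] + 1  # window covered, gaps included
        feats[tuple(tokens[i] for i in idx)] += decay ** span
    return feats


def sequence_kernel(s, t, n=2, decay=0.5):
    """Inner product of the two subsequence feature maps."""
    fs = subsequence_features(s, n, decay)
    ft = subsequence_features(t, n, decay)
    return sum(w * ft[u] for u, w in fs.items() if u in ft)


def normalized_sequence_kernel(s, t, n=2, decay=0.5):
    """Scale the kernel so that identical sequences score 1.0."""
    denom = sqrt(sequence_kernel(s, s, n, decay) * sequence_kernel(t, t, n, decay))
    return sequence_kernel(s, t, n, decay) / denom if denom else 0.0


def combined_kernel(s, t, x, y, weight=0.5, n=2, decay=0.5):
    """Weighted sum of a sequence kernel on tokens (s, t) and a linear
    kernel on numeric feature vectors (x, y). The weight is an assumption."""
    linear = sum(a * b for a, b in zip(x, y))
    return weight * normalized_sequence_kernel(s, t, n, decay) + (1 - weight) * linear
```

As a usage example under the same assumptions, `sequence_kernel(["NN", "CM", "VB"], ["NN", "CM", "ADJ"])` compares two hypothetical tag sequences that share only the contiguous pair `("NN", "CM")`, so each side contributes a weight of `decay ** 2`. A non-negative weighted sum of two valid kernels is itself a valid kernel, which is why a combination of this kind can be passed to a kernel-based learner as a precomputed Gram matrix.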
