Combining Granularity-based Topic-Dependent and Topic-Independent Evidences for Opinion Detection

Author: Malik Muhammad Saad Missen

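Abstract: Opinion mining is a sub-discipline within Information Retrieval (IR) and Computational Linguistics. It refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various online sources like news articles, social media comments, and other user-generated content. It is also known by many other terms, such as opinion finding, opinion detection, sentiment analysis, sentiment classification, and opinion polarity detection. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions on an issue as expressed by the user in the form of a query. There are many problems and challenges associated with the field of opinion mining. In this thesis, we focus on some of the major ones.

One of the foremost challenges of opinion mining is to find opinions specifically relevant to the given topic (query). A document can contain information about many topics at a time, and it is possible that it contains opinionated text about each of the topics being discussed or about only a few of them. Therefore, it becomes very important to choose topic-relevant document segments with their corresponding opinions. We approach this problem on two granularity levels, sentences and passages. In our first approach, at the sentence level, we use semantic relations of WordNet to find the opinion-topic association. In our second approach, at the passage level, we use a more robust IR model (i.e., the language model) to focus on this problem. The basic idea behind both contributions for opinion-topic association is that if a document contains more opinionated topic-relevant textual segments (i.e., sentences or passages), then it is more opinionated than a document with fewer opinionated topic-relevant textual segments.

Most machine-learning based approaches for opinion mining are domain-dependent (i.e., their performance varies from domain to domain). On the other hand, a domain- or topic-independent approach is more generalized and can sustain its effectiveness across different domains. However, topic-independent approaches generally suffer from poor performance. It is a big challenge in the field of opinion mining to develop an approach which is effective and generalized at the same time. Our contributions in this thesis include the development of such an approach, which combines simple heuristics-based topic-independent and topic-dependent features to find opinionated documents.

Entity-based opinion mining aims at identifying the relevant entities for a given topic and extracting the opinions associated with them from a set of textual documents. However, identifying entities and determining their relevancy is a task in itself. We propose an approach that takes into account information from the current news article as well as from past relevant articles in order to detect the most important entities in the current news. We look at both the local (document) and global (data collection) level to analyse the importance of an entity and to assess its relevance. Experimentation with a machine learning algorithm shows the effectiveness of our approach, giving significant improvements over the baseline.

In addition to this, we also present our framework for opinion mining and related tasks. This framework exploits content evidences of the blogosphere for tasks such as opinion prediction and multidimensional ranking. This premature contribution lays the foundations for our future work. Evaluation of our approaches includes the use of the TREC Blog 2006 data collection and the TREC Novelty track 2004 data collection. Most of the evaluations were performed under the framework of the TREC Blog track.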
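
The basic idea of opinion-topic association described in the abstract (a document containing more opinionated, topic-relevant segments is considered more opinionated) can be illustrated with a minimal sketch. This is not the method of the thesis: it assumes plain query-term overlap in place of WordNet relations or a language model, and a tiny hypothetical opinion lexicon (`OPINION_WORDS`) in place of a real lexical resource. All function names are illustrative.

```python
import re

# Hypothetical toy lexicon; a real system would use a proper lexical resource.
OPINION_WORDS = {"good", "bad", "great", "terrible", "love", "hate", "excellent", "poor"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(document):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def opinionated_relevant_segments(document, query):
    """Count sentences that mention a query term AND contain an opinion word.

    Assumption: relevance is approximated by exact query-term overlap.
    """
    query_terms = set(tokenize(query))
    count = 0
    for sentence in split_sentences(document):
        tokens = tokenize(sentence)
        relevant = any(t in query_terms for t in tokens)
        opinionated = any(t in OPINION_WORDS for t in tokens)
        if relevant and opinionated:
            count += 1
    return count


doc_a = "The camera is great. I love the camera lens. The box was blue."
doc_b = "The camera arrived on Monday. The weather was terrible."

score_a = opinionated_relevant_segments(doc_a, "camera")
score_b = opinionated_relevant_segments(doc_b, "camera")
```

Under this sketch, `doc_a` would rank above `doc_b` for the query "camera": the opinion word in `doc_b` occurs in a sentence that is not about the topic, so it does not count as evidence.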
