Author: Malik Muhammad Saad Missen
DOI:
Keywords:
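Abstract: Opinion mining is a sub-discipline within Information Retrieval (IR) and Computational Linguistics. It refers to the computational techniques for extracting, classifying, understanding, and assessing opinions expressed in various online sources such as news articles, social media comments, and other user-generated content. It is also known by many other terms, such as opinion finding, opinion detection, sentiment analysis, sentiment classification, and polarity detection. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions on an issue that the user expresses in the form of a query. There are many problems and challenges associated with the field of opinion mining. In this thesis, we focus on some of the major ones. One of the foremost challenges is to find opinions that are specifically relevant to the given topic (query). A document can contain information about many topics at a time, and it is possible that it contains opinionated text about each topic being discussed or about only a few of them. Therefore, it becomes very important to choose the topic-relevant document segments with their corresponding opinions. We approach this problem at two granularity levels, sentences and passages. In our first approach, at sentence level, we use the semantic relations of WordNet to find the opinion-topic association. In our second approach, at passage level, we use a more robust IR model (i.e., the language model) to address this problem. The basic idea behind both contributions for opinion-topic association is that if a document contains more opinionated topic-relevant textual segments (i.e., sentences or passages), then it is more opinionated than a document with fewer such segments. Most machine-learning based approaches to opinion mining are domain-dependent (i.e., their performance varies from domain to domain). On the other hand, a domain-independent or topic-independent approach is more generalized and can sustain its effectiveness across different domains. However, topic-independent approaches generally suffer from poor performance. It is a big challenge in the field of opinion mining to develop an approach that is effective and generalized at the same time. Our contributions in this thesis include the development of such an approach, which combines simple heuristics-based topic-independent and topic-dependent features to find opinionated documents. Entity-based opinion mining aims at identifying the relevant entities for a given topic and extracting the opinions associated with them from a set of textual documents. However, identifying entities and determining their relevancy is itself a challenging task. In this thesis, we focus on this challenge by proposing an approach that takes into account information from the current news article as well as from past relevant articles in order to detect the most important entities in the current news. We look at different features at both the local (document) and global (data collection) level to analyse their importance in assessing the relevance of an entity. Experimentation with a machine learning algorithm shows the effectiveness of our approach by giving significant improvements over the baseline.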
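In addition to this, we also present a framework for opinion-related tasks. This framework exploits content evidence from the blogosphere for the tasks of opinion finding, opinion prediction, and multidimensional ranking. This preliminary contribution lays the foundations for our future work. The evaluation of our approaches uses the TREC Blog 2006 data collection and the TREC Novelty track 2004 data collection. Most of the evaluations were performed under the framework of the TREC Blog track.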
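
The basic idea stated in the abstract, that a document with more opinionated topic-relevant segments is more opinionated, can be illustrated with a minimal sketch. This is not the thesis's method: the tiny lexicon `OPINION_WORDS`, the naive sentence splitter, and the function `opinion_topic_score` are all hypothetical names introduced here for illustration. Plain term overlap with the query stands in for the WordNet-based and language-model-based association that the thesis actually uses.

```python
import re

# Hypothetical toy opinion lexicon (an assumption; not taken from the thesis).
OPINION_WORDS = {"good", "bad", "great", "terrible", "love", "hate",
                 "awful", "excellent"}


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (illustrative only)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokens(sentence):
    """Return the lowercased word tokens of a sentence."""
    return re.findall(r"[a-z']+", sentence.lower())


def opinion_topic_score(document, query):
    """Count sentences that are both topic-relevant and opinionated.

    Simplifying assumptions: a sentence is topic-relevant if it shares at
    least one term with the query, and opinionated if it contains at least
    one word from OPINION_WORDS.
    """
    query_terms = set(tokens(query))
    score = 0
    for sentence in split_sentences(document):
        words = set(tokens(sentence))
        if words & query_terms and words & OPINION_WORDS:
            score += 1
    return score
```

Under these assumptions, a document that praises the queried topic in two sentences scores higher than one that mentions the topic neutrally and is opinionated only about something else, which is the ranking behaviour the abstract describes.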