Building Tagged Linguistic Unit Databases for Sentiment Detection

Authors: Albert Weichselbraun, Arno Scharl, Stefan Gindl

Keywords: Artificial intelligence, Ambiguity, Natural language processing, Adverb, Noun, Set (abstract data type), Linguistics, Verb, Parsing, Computer science, Text processing, Database, Grammatical category

Abstract: Despite the obvious business value of visualizing similarities between elements of evolving information spaces and mapping these, e.g., onto geospatial reference systems, analysts are often more interested in how the semantic orientation (sentiment) towards an organization, a product or a particular technology is changing over time. Unfortunately, popular methods that process unstructured textual material to detect this orientation automatically based on tagged dictionaries (Scharl et al. 2003) are not capable of fulfilling this task, even when coupled with part-of-speech tagging, a standard component of most text processing toolkits that distinguishes grammatical categories such as article (AT), noun (NN), verb (VB), and adverb (RB). Small corpus size, ambiguity, and the subtle incremental change of tonal expressions between different versions of a document complicate the detection of semantic orientation and prevent promising algorithms from being incorporated into commercial applications. Parsing structures, by contrast, outperforms dictionary-based approaches in terms of reliability, but usually suffers from poor scalability due to its computational complexity. This paper addresses this predicament by presenting an alternative approach: building Tagged Linguistic Unit (TLU) databases to overcome the restrictions of dictionaries with a limited set of tokens.
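As a rough illustration of the dictionary-based baseline that the abstract criticizes (not the TLU approach the paper proposes), the following minimal sketch scores text that has already been part-of-speech tagged by looking up each (token, tag) pair in a tagged dictionary. The lexicon entries, their weights, and the function name `score_tagged_tokens` are hypothetical and chosen only for this example; the tags follow the style of the categories named in the abstract.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical mini-lexicon: (lowercased token, POS tag) -> sentiment value.
# Entries and weights are invented for illustration, not taken from the paper.
TAGGED_LEXICON: Dict[Tuple[str, str], float] = {
    ("good", "JJ"): 1.0,
    ("poor", "JJ"): -1.0,
    ("fine", "JJ"): 0.5,   # adjective reading, as in "a fine product"
    ("fine", "NN"): -0.5,  # noun reading, as in "paid a fine"
}


def score_tagged_tokens(tagged: Iterable[Tuple[str, str]]) -> float:
    """Sum lexicon values over (token, tag) pairs; unknown pairs contribute 0."""
    return sum(
        TAGGED_LEXICON.get((token.lower(), tag), 0.0) for token, tag in tagged
    )


# Input is assumed to come from some part-of-speech tagger run beforehand.
adjective_use = [("a", "AT"), ("fine", "JJ"), ("product", "NN")]
noun_use = [("paid", "VBD"), ("a", "AT"), ("fine", "NN")]

print(score_tagged_tokens(adjective_use))
print(score_tagged_tokens(noun_use))
```

The tag lets the lookup separate the two readings of "fine", but any token or sense missing from the dictionary is silently scored as neutral, which is one way to picture the "limited set of tokens" restriction mentioned in the abstract.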

References (15)
Kushal Dave, Steve Lawrence, David M. Pennock. Mining the peanut gallery. Proceedings of the Twelfth International Conference on World Wide Web - WWW '03, pp. 519-528 (2003). doi:10.1145/775152.775226
Albert Weichselbraun, Elizabeth Chang, Wei Liu, Arno Scharl. Semi-Automatic Ontology Extension Using Spreading Activation. I-KNOW '05: 5th International Conference on Knowledge Management, pp. 145-153 (2005).
Bo Pang, Lillian Lee, Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. Empirical Methods in Natural Language Processing, pp. 79-86 (2002). doi:10.3115/1118693.1118704
Albert Weichselbraun, Arno Scharl. Web Coverage of the 2004 US Presidential Election. Conference of the European Chapter of the Association for Computational Linguistics, pp. 35-42 (2006).
Adam Kilgarriff, Pavel Rychlý, Pavel Smrž, David Tugwell. The Sketch Engine. Proceedings of the 11th EURALEX International Congress, pp. 105-115 (2004).
Nigel Collier, Tony Mullen. Sentiment Analysis using Support Vector Machines with Diverse Information Sources. Empirical Methods in Natural Language Processing, pp. 412-418 (2004).
Arno Scharl, Irene Pollach, Christian Bauer. Determining the Semantic Orientation of Web-Based Corpora. Intelligent Data Engineering and Automated Learning, pp. 840-849 (2003). doi:10.1007/978-3-540-45080-1_116
Adam Kilgarriff, Pavel Rychlý, Pavel Smrž, David Tugwell. The Sketch Engine. Proceedings of the Corpus Linguistics Conference 2009 (CL2009), 2009, p. 177, pp. 105-116 (2004).
Adam Kilgarriff, Roger Evans, Rob Koeling, Michael Rundell, David Tugwell. WASPBENCH: a lexicographer's workbench supporting state-of-the-art word sense disambiguation. Conference of the European Chapter of the Association for Computational Linguistics, pp. 211-214 (2003). doi:10.3115/1067737.1067787