Building subjectivity lexicon(s) from scratch for essay data

Authors: Beata Beigman Klebanov, Jill Burstein, Nitin Madnani, Adam Faulkner, Joel Tetreault

DOI: 10.1007/978-3-642-28604-9_48

Keywords: Process (engineering), Paraphrase, Lexicon, Subjectivity, Sentiment analysis, Computer science, Natural language processing, Scratch, Artificial intelligence, Linguistics

Abstract: While there are a number of subjectivity lexicons available for research purposes, none can be used commercially. We describe the process of constructing subjectivity lexicon(s) for recognizing sentiment polarity in essays written by test-takers, to be used within a commercial essay-scoring system. We discuss ways of expanding a manually-built seed lexicon using dictionary-based, distributional in-domain and out-of-domain information, as well as using Amazon Mechanical Turk to help "clean up" the expansions. We show the feasibility of constructing a family of subjectivity lexicons from scratch using a combination of methods to attain competitive performance with state-of-the-art research-only lexicons. Furthermore, this is the first use, to our knowledge, of a paraphrase generation system for expanding a subjectivity lexicon.
