Inferring gender of movie reviewers: exploiting writing style, content and metadata

Author: Jahna Otterbacher

DOI: 10.1145/1871437.1871487

Abstract: Despite differences in the way that men and women experience goods and communicate their perspectives, online review communities typically do not provide participants' gender. We propose to infer author gender, given a set of reviews of a particular item, and experiment on reviews posted at the Internet Movie Database (IMDb). Using logistic regression, we explore the contribution of three types of information: 1) writing style, 2) content, and 3) metadata (e.g. age, social feedback). Our results concur with previous research, in that there are salient differences in writing style and content between reviews authored by men versus women. However, in comparison to literary or scientific texts, to which classification tasks are often applied, reviews are brief and occur within the context of an ongoing discourse. Therefore, to compensate for the brevity of reviews, stylistic features can be augmented with metadata. We find that perceived utility is an important correlate of gender. The model incorporating all features has an accuracy of 73.7% and is not as sensitive to review length as those based only on stylistic features.
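The setup described in the abstract, a logistic regression over style, content, and metadata features, can be illustrated with a minimal sketch. This is not the paper's implementation: the feature definitions (`style` as scaled mean word length, `content` as the share of words in a hypothetical `GENRE_WORDS` vocabulary, `utility` as a helpful-vote ratio), the field names `helpful` and `votes`, the toy reviews, and the arbitrary 0/1 labels are all assumptions made up for illustration, and plain gradient descent stands in for whatever fitting procedure the author used.

```python
import math

# Hypothetical content vocabulary; the paper's actual content features are not given here.
GENRE_WORDS = {"plot", "action", "romance", "acting"}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def featurize(review):
    """Map a review dict to [bias, style, content, utility], each in [0, 1]."""
    words = review["text"].lower().split()
    n = max(len(words), 1)
    style = min(sum(len(w) for w in words) / n / 10.0, 1.0)  # scaled mean word length
    content = sum(w in GENRE_WORDS for w in words) / n       # share of vocabulary words
    utility = review["helpful"] / max(review["votes"], 1)    # perceived-utility ratio
    return [1.0, style, content, utility]


def predict_proba(w, x):
    """Probability of label 1 under the logistic model."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


def train(X, y, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the mean logistic loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


# Invented toy reviews with arbitrary binary author labels (placeholders only).
reviews = [
    {"text": "great plot and strong acting", "helpful": 9, "votes": 10},
    {"text": "the romance felt honest", "helpful": 8, "votes": 10},
    {"text": "too long", "helpful": 1, "votes": 10},
    {"text": "loud action with little plot", "helpful": 2, "votes": 10},
]
labels = [1, 1, 0, 0]

X = [featurize(r) for r in reviews]
weights = train(X, labels)
predictions = [int(predict_proba(weights, x) >= 0.5) for x in X]
```

Concatenating the three feature groups into one vector is what lets the metadata term carry signal when a review is too short for the text-derived features to be informative, which is the motivation the abstract gives for augmenting stylistic features.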
