Usage data in web search: benefits and limitations

Authors: Ricardo Baeza-Yates, Yoelle Maarek

DOI: 10.1007/978-3-642-34109-0_2

Keywords: Usage data, Personalization, World Wide Web, Crowds, Web page, Link analysis, Graph (abstract data type), Big data, Computer science, Crawling

Abstract: Web Search, which takes its root in the mature field of information retrieval, evolved tremendously over the last 20 years. The field encountered its first revolution when it started to deal with huge amounts of Web pages. Then, a major step was accomplished when engines started to consider the structure of the Web graph, and link analysis became a differentiator in both crawling and ranking. Finally, a more discrete, but not less critical step, was made when search engines started to monitor and mine the numerous (mostly implicit) signals provided by users while interacting with the search engine. We focus here on this third "revolution" of large scale usage data. We detail the different shapes it takes, illustrating its benefits through a review of some winning search features that could not have been possible without it. We also discuss its limitations and how in some cases it even conflicts with some natural users' aspirations such as personalization and privacy. We conclude by discussing how some of these conflicts can be circumvented by using adequate aggregation principles to create "ad hoc" crowds.

References (17)
Ricardo Baeza-Yates, Felipe Saint-Jean. A Three Level Search Engine Index Based in Query Log Distribution. String Processing and Information Retrieval, pp. 56-65 (2003). DOI: 10.1007/978-3-540-39984-1_5
Ricardo Baeza-Yates, Andrei Z. Broder, Yoelle Maarek. The new frontier of web search technology: seven challenges. Search Computing, pp. 3-9 (2011). DOI: 10.1007/978-3-642-19668-3_1
Eli Pariser. The Filter Bubble: What the Internet Is Hiding from You. The Penguin Group (2011).
Edward Cutrell, Zhiwei Guan. What are you looking for?: an eye-tracking study of information usage in web search. Human Factors in Computing Systems, pp. 407-416 (2007). DOI: 10.1145/1240624.1240690
Henry A. Feild, James Allan, Rosie Jones. Predicting searcher frustration. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 34-41 (2010). DOI: 10.1145/1835449.1835458
Karen Kukich. Techniques for automatically correcting words in text. ACM Computing Surveys, vol. 24, pp. 377-439 (1992). DOI: 10.1145/146370.146380
Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan. Web usage mining. ACM SIGKDD Explorations Newsletter, vol. 1, pp. 12-23 (2000). DOI: 10.1145/846183.846188
David J. Brenes, Daniel Gayo-Avello, Kilian Pérez-González. Survey and evaluation of query intent detection methods. Proceedings of the 2009 Workshop on Web Search Click Data - WSCD '09, pp. 1-7 (2009). DOI: 10.1145/1507509.1507510
Filip Radlinski, Susan Dumais. Improving personalized web search using result diversification. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '06, pp. 691-692 (2006). DOI: 10.1145/1148170.1148320