Dissimilarity Based Query Selection for Efficient Preference Based IR Evaluation

Authors: Gabriella Kazai, Homer Sung

Abstract: The evaluation of Information Retrieval (IR) systems has recently been exploring the use of preference judgments over two lists of search results, presented side-by-side to judges. Such judgments have been shown to capture a richer set of relevance criteria than traditional methods of collecting labels per single document. However, they are expensive to obtain and less reusable, as any change to either side necessitates a new judgment. In this paper, we propose a way to measure the dissimilarity between the two sides in side-by-side experiments and show how it can be used to prioritize the queries to be judged in an offline setting. Our proposed measure, referred to as Weighted Ranking Difference (WRD), takes into account both the ranking differences and the similarity of the documents across the two sides, where a document may, for example, be a URL or a query suggestion. We empirically evaluate our measure on a large-scale, real-world dataset of crowdsourced judgments over ranked auto-completion suggestions. We show that the WRD score is indicative of the probability of a tie and can, on average, save 25% of the judging resources.
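The abstract does not give the WRD formula, so the following is only a minimal sketch of the general idea: score how different two ranked lists are by combining rank displacement with item similarity, then judge the most dissimilar queries first. Everything in it is an assumption made for illustration, not the paper's definition: the function names `token_jaccard`, `list_dissimilarity`, and `prioritize`, the `1 / (rank + 1)` position weights, the token-level Jaccard similarity, and the discount by rank distance.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def token_jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens (assumed item similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def _directed(src: Sequence[str], dst: Sequence[str],
              sim: Callable[[str, str], float]) -> float:
    """Rank-weighted cost of explaining each item in src by some item in dst."""
    total, norm = 0.0, 0.0
    for i, item in enumerate(src):
        weight = 1.0 / (i + 1)  # assumed weighting: top ranks matter more
        # Best match on the other side, discounted by how far it moved in rank.
        best = max(sim(item, other) / (1 + abs(i - j))
                   for j, other in enumerate(dst))
        total += weight * (1.0 - best)
        norm += weight
    return total / norm


def list_dissimilarity(left: Sequence[str], right: Sequence[str],
                       sim: Callable[[str, str], float] = token_jaccard) -> float:
    """Symmetric dissimilarity in [0, 1]; 0 means identical ranked lists."""
    if not left and not right:
        return 0.0
    if not left or not right:
        return 1.0
    return 0.5 * (_directed(left, right, sim) + _directed(right, left, sim))


def prioritize(pairs: Dict[str, Tuple[List[str], List[str]]]) -> List[str]:
    """Order queries so that the most dissimilar side-by-side pairs come first."""
    return sorted(pairs, key=lambda q: list_dissimilarity(*pairs[q]), reverse=True)


if __name__ == "__main__":
    # Hypothetical auto-completion lists for two queries.
    pairs = {
        "weather": (["weather today", "weather radar"],
                    ["weather today", "weather radar"]),
        "python": (["python tutorial", "python download"],
                   ["monty python", "python snake"]),
    }
    for query in prioritize(pairs):
        print(query, round(list_dissimilarity(*pairs[query]), 3))
```

Under this sketch, a query whose two sides are identical scores 0 and sinks to the bottom of the judging queue, which matches the abstract's observation that low dissimilarity goes with a higher chance of a tie. Where to cut off the queue is a budget decision that the sketch leaves open.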
