Unbiased Ranking Evaluation on a Budget

Authors: Tobias Schnabel, Adith Swaminathan, Thorsten Joachims

DOI: 10.1145/2740908.2742565

Abstract: We address the problem of assessing the quality of a ranking system (e.g., search engine, recommender system, review ranker) given a fixed budget for collecting expert judgments. In particular, we propose a method that selects which items to judge in order to optimize the accuracy of the quality estimate. Our method is not only efficient, but also provides estimates that are unbiased, unlike common approaches that tend to underestimate performance or that have a bias against new systems that are evaluated by re-using previous relevance scores.
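
The abstract does not spell out the estimator, so the following is only a minimal sketch of one standard way to obtain an unbiased quality estimate under a judgment budget: importance sampling over rank positions. The DCG-style metric, the sampling distribution proportional to rank weight, and the names `rank_weights`, `true_quality`, `estimate_quality`, and `judge` are all assumptions made for illustration, not the authors' method.

```python
import math
import random


def rank_weights(n):
    """DCG-style position weights 1/log2(rank + 1) for ranks 1..n (assumed metric)."""
    return [1.0 / math.log2(r + 2) for r in range(n)]


def true_quality(ranking, judge):
    """Exact metric value; needs a judgment for every item in the ranking."""
    weights = rank_weights(len(ranking))
    return sum(w * judge(item) for w, item in zip(weights, ranking))


def estimate_quality(ranking, judge, budget, rng):
    """Importance-sampling estimate that spends exactly `budget` judgments.

    `judge` is a hypothetical callable standing in for an expert judgment.
    Positions are drawn with replacement, with probability proportional to
    their rank weight, and each draw is reweighted by 1/probability, so the
    expected value of the estimate equals true_quality(ranking, judge).
    """
    weights = rank_weights(len(ranking))
    total = sum(weights)
    probs = [w / total for w in weights]
    draws = rng.choices(range(len(ranking)), weights=probs, k=budget)
    return sum(weights[i] * judge(ranking[i]) / probs[i] for i in draws) / budget


if __name__ == "__main__":
    rng = random.Random(0)
    ranking = list(range(20))
    relevant = {0, 3, 4, 11, 17}           # made-up ground truth for the demo
    judge = lambda item: 1.0 if item in relevant else 0.0
    print("exact:   ", true_quality(ranking, judge))
    print("estimate:", estimate_quality(ranking, judge, budget=5, rng=rng))
```

A single estimate from five judgments is noisy, but averaging many independent estimates approaches the exact value. Estimators that instead judge only the top-ranked items and treat the rest as non-relevant tend to underestimate quality, which is the bias the abstract refers to.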

References (13)
Marek J. Druzdzel, Changhe Yuan. How Heavy Should the Tails Be? The Florida AI Research Society. pp. 799-805 (2005).
Ben Carterette, Virgil Pavlu, Evangelos Kanoulas, Javed A. Aslam, James Allan. If I Had a Million Queries. Lecture Notes in Computer Science. pp. 288-300 (2009). DOI: 10.1007/978-3-642-00958-7_27
Javed A. Aslam, Virgil Pavlu, Emine Yilmaz. A statistical method for system evaluation using incomplete judgments. Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06. pp. 541-548 (2006). DOI: 10.1145/1148170.1148263
Kalervo Järvelin, Jaana Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems. vol. 20, pp. 422-446 (2002). DOI: 10.1145/582415.582418
Rabia Nuray, Fazli Can. Automatic ranking of retrieval systems in imperfect environments. International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 379-380 (2003). DOI: 10.1145/860435.860510
Lihong Li, Jin Young Kim, Imed Zitouni. Toward Predicting the Outcome of an A/B Experiment for Search Relevance. Web Search and Data Mining. pp. 37-46 (2015). DOI: 10.1145/2684822.2685311
Katja Hofmann, Shimon Whiteson, Maarten de Rijke. Estimating interleaved comparison outcomes from historical click data. Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12. pp. 1779-1783 (2012). DOI: 10.1145/2396761.2398516
Sham M Kakade, Lihong Li, John Langford, Alex Strehl. Learning from Logged Implicit Exploration Data. Neural Information Processing Systems. vol. 23, pp. 2217-2225 (2010).
Emine Yilmaz, Evangelos Kanoulas, Javed A. Aslam. A simple and efficient sampling method for estimating AP and NDCG. Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08. pp. 603-610 (2008). DOI: 10.1145/1390334.1390437