Toward Predicting the Outcome of an A/B Experiment for Search Relevance

Authors: Lihong Li, Jin Young Kim, Imed Zitouni

DOI: 10.1145/2684822.2685311

Abstract: A standard approach to estimating online click-based metrics of a ranking function is to run it in a controlled experiment on live users. While reliable and popular in practice, configuring and running an online experiment is cumbersome and time-intensive. In this work, inspired by recent successes of offline evaluation techniques for recommender systems, we study an alternative that uses historical search logs to reliably predict online click-based metrics of a new ranking function, without actually running it on live users. To tackle novel challenges encountered in Web search, two variations of the basic techniques are proposed. The first is to take advantage of the diversified behavior of a search engine over a long period of time to simulate randomized data collection, so that our approach can be used at very low cost. The second is to replace exact matching (of recommended items in previous work) by fuzzy matching (of search result pages) to increase data efficiency, via a better trade-off of bias and variance. Extensive experimental results based on large-scale real search data from a major commercial search engine in the US market demonstrate that our approach is promising and has potential for wide use in Web search.
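The contrast between exact and fuzzy matching of result pages can be illustrated with a small sketch. This is not the estimator from the paper: it is a simplified matching-based replay average with no propensity correction, and the names `LogRecord`, `pages_match`, `replay_estimate`, `toy_ranker`, the `top_k` prefix rule used as the "fuzzy" criterion, and the toy log are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical log format: (query, result page that was shown, click-based reward).
LogRecord = Tuple[str, Sequence[str], float]


def pages_match(logged: Sequence[str], proposed: Sequence[str],
                top_k: Optional[int] = None) -> bool:
    """Exact match when top_k is None; otherwise compare only the first top_k results."""
    if top_k is None:
        return list(logged) == list(proposed)
    return list(logged[:top_k]) == list(proposed[:top_k])


def replay_estimate(log: Sequence[LogRecord],
                    ranker: Callable[[str], Sequence[str]],
                    top_k: Optional[int] = None) -> Tuple[Optional[float], int]:
    """Average the logged reward over impressions whose page matches the new ranker's page.

    Returns (estimate, number of matched impressions); the estimate is None
    when nothing matches. No propensity weighting is applied (an assumption
    that keeps the sketch short).
    """
    matched: List[float] = [
        reward for query, page, reward in log
        if pages_match(page, ranker(query), top_k)
    ]
    if not matched:
        return None, 0
    return sum(matched) / len(matched), len(matched)


# Toy data, invented for this example only.
TOY_LOG: List[LogRecord] = [
    ("q1", ["a", "b", "c"], 1.0),
    ("q1", ["a", "b", "d"], 0.0),
    ("q1", ["b", "a", "c"], 1.0),
    ("q2", ["x", "y", "z"], 0.0),
]
TOY_RANKINGS = {"q1": ["a", "b", "c"], "q2": ["x", "z", "y"]}


def toy_ranker(query: str) -> Sequence[str]:
    """Stand-in for the new ranking function being evaluated offline."""
    return TOY_RANKINGS[query]


exact = replay_estimate(TOY_LOG, toy_ranker)           # exact page match
fuzzy = replay_estimate(TOY_LOG, toy_ranker, top_k=2)  # match on top-2 only
```

On this toy log, exact matching keeps a single impression, while matching on the top two results keeps two. Relaxing the match reuses more of the log, which lowers variance, but the extra impressions showed a slightly different page, which introduces bias. That is the trade-off the abstract refers to.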
