Authors: Lihong Li, Jin Young Kim, Imed Zitouni
Keywords:
Abstract: A standard approach to estimating online click-based metrics of a ranking function is to run it in a controlled experiment on live users. While reliable and popular in practice, configuring and running an online experiment is cumbersome and time-intensive. In this work, inspired by recent successes of offline evaluation techniques for recommender systems, we study an alternative that uses historical search log data to reliably predict online click-based metrics of a \emph{new} ranking function, without actually running it on live users. To tackle novel challenges encountered in Web search, two variations of the basic techniques are proposed. The first is to take advantage of the diversified behavior of a search engine over a long period of time to simulate randomized data collection, so that our approach can be used at very low cost. The second is to replace exact matching (of recommended items in previous work) by \emph{fuzzy} matching (of search result pages) to increase data efficiency, via a better trade-off of bias and variance. Extensive experimental results based on large-scale real search data from a major commercial search engine in the US market demonstrate that our approach is promising and has potential for wide use in Web search.