Authors: Nick Craswell, Hamed Zamani, Gord Lueck, Everest Chen, Flint Luu
DOI:
Keywords:
Abstract: Search clarification has recently attracted much attention due to its applications in search engines. It has also been recognized as a major component of conversational information seeking systems. Despite its importance, the research community still lacks large-scale data for studying different aspects of search clarification. In this paper, we introduce MIMICS, a collection of search clarification datasets for real web search queries sampled from the Bing query logs. Each clarification in MIMICS is generated by a Bing production algorithm and consists of a clarifying question and up to five candidate answers. MIMICS contains three datasets: (1) MIMICS-Click includes over 400k unique queries, their associated clarification panes, and the corresponding aggregated user interaction signals (i.e., clicks). (2) MIMICS-ClickExplore is an exploration dataset that includes aggregated user interaction signals for over 60k unique queries, each with multiple clarification panes. (3) MIMICS-Manual includes over 2k unique real search queries. Each query-clarification pair in this dataset has been manually labeled by at least three trained annotators. It contains graded quality labels for the clarifying question, the candidate answer set, and the landing result page for each candidate answer. MIMICS is publicly available for research purposes, thus enabling researchers to study a number of tasks related to search clarification, including clarification generation and selection, user engagement prediction for clarification, click models for clarification, and analyzing user interactions with search clarification.