An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization.

Authors: Adnan Boz, Liangjie Hong

DOI:

Keywords: Component (UML), Scheme (programming language), User engagement, Recommender system, Computer science, Machine learning, Ranking, Artificial intelligence, Data collection, Personalization, Information retrieval

Abstract: One of the missions of personalization and recommender systems is to show content items according to users' personal interests. To achieve this goal, these systems learn user interests over time and try to present content items tailored to user profiles. Recommending items according to users' preferences has been investigated extensively in the past few years, mainly thanks to the popularity of the Netflix competition. In a real setting, users may be attracted by only a subset of those items and interact with them, leaving only partial feedback for the system to learn from in the next cycle. This introduces significant biases into the system and results in a situation where user engagement metrics cannot be improved over time. The problem is not limited to one component of the system. The data collected from users is usually used in many different tasks, including learning ranking functions, building user profiles and constructing content classifiers. Once the data is biased, all these downstream use cases are impacted as well. Therefore, it would be beneficial to gather unbiased data through user interactions. Traditionally, unbiased data collection is done by showing items sampled uniformly from the content pool. However, this simple scheme is not feasible, as it risks user engagement metrics and takes a long time to gather user feedback. In this paper, we introduce a user-friendly unbiased data collection framework, utilizing methods developed in the exploitation and exploration literature. We discuss how the framework differs from normal multi-armed bandit problems and why such a method is needed. We lay out a novel Thompson sampling for Bernoulli ranked-list to effectively balance user experience and data collection. The proposed method is validated in a real bucket test, and we show strong results compared to old algorithms.
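To make the idea of Thompson sampling over a ranked list with Bernoulli (click / no click) feedback more concrete, here is a minimal sketch of the textbook Beta-Bernoulli version. It is not the authors' algorithm, whose details the abstract does not give. The class name `BetaBernoulliRanker`, the `simulate` helper, the uniform Beta(1, 1) prior and the simulated click rates are all assumptions made for illustration, and position bias within the list is ignored.

```python
import random


class BetaBernoulliRanker:
    """Generic Beta-Bernoulli Thompson sampling over items (hypothetical helper)."""

    def __init__(self, item_ids, prior_alpha=1.0, prior_beta=1.0, seed=None):
        self.rng = random.Random(seed)
        # One Beta(alpha, beta) posterior per item, stored as [alpha, beta].
        self.posterior = {i: [prior_alpha, prior_beta] for i in item_ids}

    def rank(self, k):
        # Draw one plausible click rate per item from its posterior,
        # then show the k items with the highest draws.
        draws = {
            i: self.rng.betavariate(a, b) for i, (a, b) in self.posterior.items()
        }
        return sorted(draws, key=draws.get, reverse=True)[:k]

    def update(self, shown, clicked):
        # Bernoulli feedback: a click adds 1 to alpha,
        # an impression without a click adds 1 to beta.
        for item in shown:
            if item in clicked:
                self.posterior[item][0] += 1
            else:
                self.posterior[item][1] += 1


def simulate(true_ctr, rounds=2000, k=3, seed=0):
    """Run the ranker against simulated users (assumed, made-up click rates)."""
    rng = random.Random(seed)
    ranker = BetaBernoulliRanker(true_ctr, seed=seed)
    for _ in range(rounds):
        shown = ranker.rank(k)
        clicked = {i for i in shown if rng.random() < true_ctr[i]}
        ranker.update(shown, clicked)
    return ranker


if __name__ == "__main__":
    ranker = simulate({"a": 0.30, "b": 0.10, "c": 0.05, "d": 0.02, "e": 0.01})
    for item, (alpha, beta) in ranker.posterior.items():
        print(item, "impressions:", int(alpha + beta - 2))
```

Because every item keeps a nonzero chance of producing a high draw, items with little feedback still get shown occasionally, while items with strong observed click rates dominate the list. This is the general exploration/exploitation balance the abstract refers to, in contrast to showing uniformly sampled items.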

References (8)
Aditya Gopalan, Shie Mannor, Yishay Mansour. Thompson Sampling for Complex Online Problems. International Conference on Machine Learning, pp. 100-108 (2014)
Liang Tang, Romer Rosales, Ajit Singh, Deepak Agarwal. Automatic ad format selection via contextual bandits. Proceedings of the 22nd ACM International Conference on Information & Knowledge Management (CIKM '13), pp. 1587-1594 (2013). 10.1145/2505515.2514700
Deepak Agarwal, Bee-Chung Chen, Pradheep Elango. Explore/Exploit Schemes for Web Content Optimization. International Conference on Data Mining, pp. 1-10 (2009). 10.1109/ICDM.2009.52
Lihong Li, Jin Young Kim, Imed Zitouni. Toward Predicting the Outcome of an A/B Experiment for Search Relevance. Web Search and Data Mining, pp. 37-46 (2015). 10.1145/2684822.2685311
Olivier Chapelle, Lihong Li. An Empirical Evaluation of Thompson Sampling. Neural Information Processing Systems, vol. 24, pp. 2249-2257 (2011)
Lihong Li, Wei Chu, John Langford, Robert E. Schapire. A contextual-bandit approach to personalized news article recommendation. The Web Conference, pp. 661-670 (2010). 10.1145/1772690.1772758
Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, Lars Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. Uncertainty in Artificial Intelligence, pp. 452-461 (2009)
Alex Strehl, John Langford, Lihong Li, Sham Kakade. Learning from Logged Implicit Exploration Data. arXiv: Learning (2010)