An Optimal Greedy Heuristic with Minimal Learning Regret for the Markov Chain Choice Model

作者: Guillermo Gallego , Wentao Lu

DOI: 10.2139/SSRN.3810470

关键词: HeuristicsMarkov chainGreedy algorithmMathematical optimizationMulti-armed banditLocal optimumOptimization problemHeuristicDiscrete choiceComputer science

摘要: We study the assortment optimization problem and show that local optima are global for all discrete choice models can be represented by Markov Chain model. develop a forward greedy heuristic finds an optimal model runs in $O(n^2)$ iterations. The has performance bound $1/n$ any regular which is best possible among polynomial heuristics. also propose backward chain requires fewer Numerical results our heuristics performs significantly better than estimate then optimize method revenue-ordered when ground truth latent class multinomial logit Based on heuristics, we learning algorithm enjoys asymptotic regret avoids parameter estimations, focusing instead binary comparisons of revenues.

参考文章(26)
R. L. Plackett, The Analysis of Permutations Journal of The Royal Statistical Society Series C-applied Statistics. ,vol. 24, pp. 193- 202 ,(1975) , 10.2307/2346567
G. L. Nemhauser, L. A. Wolsey, M. L. Fisher, An analysis of approximations for maximizing submodular set functions--I Mathematical Programming. ,vol. 14, pp. 265- 294 ,(1978) , 10.1007/BF01588971
Daniel McFadden, Econometric Models for Probabilistic Choice Among Products The Journal of Business. ,vol. 53, pp. 13- 29 ,(1980) , 10.1086/296093
Guillermo Gallego, Richard Ratliff, Sergey Shebalov, A General Attraction Model and Sales-Based Linear Program for Network Revenue Management Under Customer Choice Operations Research. ,vol. 63, pp. 212- 232 ,(2015) , 10.1287/OPRE.2014.1328
Paat Rusmevichientong, John N. Tsitsiklis, Linearly Parameterized Bandits Mathematics of Operations Research. ,vol. 35, pp. 395- 411 ,(2010) , 10.1287/MOOR.1100.0446
H C W L Williams, On the Formation of Travel Demand Models and Economic Evaluation Measures of User Benefit Environment and Planning A. ,vol. 9, pp. 285- 344 ,(1977) , 10.1068/A090285
Vivek F. Farias, Srikanth Jagabathula, Devavrat Shah, A Nonparametric Approach to Modeling Choice with Limited Data Management Science. ,vol. 59, pp. 305- 322 ,(2013) , 10.1287/MNSC.1120.1610
James M. Davis, Guillermo Gallego, Huseyin Topaloglu, Assortment Optimization Under Variants of the Nested Logit Model Operations Research. ,vol. 62, pp. 250- 273 ,(2014) , 10.1287/OPRE.2014.1256
Stochastic Choice and Consideration Sets Econometrica. ,vol. 82, pp. 1153- 1176 ,(2014) , 10.3982/ECTA10575
Peter Auer, Using confidence bounds for exploitation-exploration trade-offs Journal of Machine Learning Research. ,vol. 3, pp. 397- 422 ,(2003)