Optimal Contextual Pricing and Extensions

作者: Renato Paes Leme , Jon Schneider , Allen Liu

DOI:

关键词:

摘要: In the contextual pricing problem a seller repeatedly obtains products described by an adversarially chosen feature vector in $\mathbb{R}^d$ and only observes purchasing decisions of buyer with fixed but unknown linear valuation over products. The regret measures difference between revenue could have obtained knowing what can be learning algorithm. We give poly-time algorithm for $O(d \log T + d d)$ which matches $\Omega(d T)$ lower bound up to $d d$ additive factor. If we replace loss symmetric loss, obtain nearly optimal matching $\Omega(d)$ $\log d$. These algorithms are based on novel technique bounding value Steiner polynomial convex region at various scales. is degree $d$ intrinsic volumes as coefficients. also study generalized version search where hidden function Euclidean space replaced $f : \mathcal{X} \rightarrow \mathcal{Y}$ certain hypothesis class $\mathcal{H}$. We provide generic $O(d^2)$ covering dimension this class. This leads particular $\tilde{O}(s^2)$ if guaranteed $s$-sparse. Finally extend our results noisy feedback model, each round flipped probability $p < 1/2$.

参考文章(23)
Ravi Kannan, L�szl� Lov�sz, Mikl�s Simonovits, Random walks and an O * ( n 5 ) volume algorithm for convex bodies Random Structures and Algorithms. ,vol. 11, pp. 1- 50 ,(1997) , 10.1002/(SICI)1098-2418(199708)11:1<1::AID-RSA1>3.0.CO;2-X
Peter L. Bartlett, Philip M. Long, Robert C. Williamson, Fat-shattering and the learnability of real-valued functions Proceedings of the seventh annual conference on Computational learning theory - COLT '94. ,vol. 52, pp. 299- 310 ,(1994) , 10.1145/180139.181158
László Lovász, Hit-and-run mixes fast Mathematical Programming. ,vol. 86, pp. 443- 461 ,(1999) , 10.1007/S101070050099
Dimitris Bertsimas, Santosh Vempala, Solving convex programs by random walks Journal of the ACM. ,vol. 51, pp. 540- 556 ,(2004) , 10.1145/1008731.1008733
R. Kleinberg, T. Leighton, The value of knowing a demand curve: bounds on regret for online posted-price auctions foundations of computer science. pp. 594- 605 ,(2003) , 10.1109/SFCS.2003.1238232
Jacob D. Abernethy, Elad Hazan, Alexander Rakhlin, Interior-Point Methods for Full-Information and Bandit Online Learning IEEE Transactions on Information Theory. ,vol. 58, pp. 4164- 4175 ,(2012) , 10.1109/TIT.2012.2192096
Umar Syed, Kareem Amin, Afshin Rostamizadeh, Repeated Contextual Auctions with Strategic Buyers neural information processing systems. ,vol. 27, pp. 622- 630 ,(2014)
Aleksandrs Slivkins, Contextual bandits with similarity information Journal of Machine Learning Research. ,vol. 15, pp. 2533- 2568 ,(2014)
Adam Kalai, Santosh Vempala, Efficient algorithms for online decision problems Journal of Computer and System Sciences. ,vol. 71, pp. 291- 307 ,(2005) , 10.1016/J.JCSS.2004.10.016