Asymptotic minimax regret for data compression, gambling and prediction

Authors: Qun Xie, A.R. Barron

DOI: 10.1109/18.825803

Keywords:

Abstract: For problems of data compression, gambling, and prediction of individual sequences $x_1, \ldots, x_n$, the following questions arise. Given a target family of probability mass functions $p(x^n \mid \theta)$, how do we choose a probability mass function $q(x^n)$ so that it approximately minimizes the maximum regret $\max_{x^n}\bigl(\log 1/q(x^n) - \log 1/p(x^n \mid \hat{\theta})\bigr)$, and so that it achieves the best constant $C$ in the asymptotics of the minimax regret, which is of the form $(d/2)\log(n/2\pi) + C + o(1)$, where $d$ is the parameter dimension? Are there easily implementable strategies $q$ that achieve those asymptotics? And how does the solution to the worst-case sequence problem relate to the solution of the corresponding expectation version $\min_q \max_\theta E_\theta\bigl(\log 1/q(X^n) - \log 1/p(X^n \mid \theta)\bigr)$? In the discrete memoryless case, with a given alphabet of size $m$, the Bayes procedure with the Dirichlet$(1/2, \ldots, 1/2)$ prior is asymptotically maximin. Simple modifications of it are shown to be asymptotically minimax. The best constant is $C_m = \log\bigl(\Gamma(1/2)^m / \Gamma(m/2)\bigr)$, which agrees with the logarithm of the integral of the square root of the determinant of the Fisher information. Moreover, the asymptotically minimax strategies for the worst-case problem are also asymptotically optimal for the expectation version. Analogous conclusions are given for prediction, gambling, and compression when, for each observation, one has access to side information from an alphabet of size $k$. In this setting the minimax regret is $(k(m-1)/2)\log(n/2\pi k) + kC_m + o(1)$.
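As a concrete illustration of the discrete memoryless case, here is a minimal Python sketch (function names are my own, not from the paper) that computes the Dirichlet$(1/2, \ldots, 1/2)$ Bayes mixture probability in closed form, the pointwise regret against the maximum-likelihood plug-in, and the asymptotic approximations $(d/2)\log(n/2\pi) + C_m$ with $d = m - 1$, including the side-information version:

```python
import math
from math import lgamma
from collections import Counter

def kt_log_prob(xs, m):
    """Log-probability assigned to sequence xs (symbols in 0..m-1) by the
    Bayes mixture under the Dirichlet(1/2,...,1/2) prior (the
    Krichevsky-Trofimov estimator), via the closed form
    q(x^n) = Gamma(m/2)/Gamma(n + m/2) * prod_a Gamma(n_a + 1/2)/Gamma(1/2)."""
    n = len(xs)
    counts = Counter(xs)
    log_q = lgamma(m / 2) - lgamma(n + m / 2)
    for a in range(m):
        log_q += lgamma(counts[a] + 0.5) - lgamma(0.5)
    return log_q

def ml_log_prob(xs):
    """log p(x^n | theta-hat): plug-in log-likelihood at the maximum-
    likelihood parameter, i.e. the empirical symbol frequencies."""
    n = len(xs)
    return sum(c * math.log(c / n) for c in Counter(xs).values())

def regret(xs, m):
    """Pointwise regret log 1/q(x^n) - log 1/p(x^n | theta-hat)."""
    return ml_log_prob(xs) - kt_log_prob(xs, m)

def asymptotic_regret(n, m):
    """(d/2) log(n/2pi) + C_m with d = m - 1 and
    C_m = log(Gamma(1/2)^m / Gamma(m/2))."""
    C_m = m * lgamma(0.5) - lgamma(m / 2)
    return ((m - 1) / 2) * math.log(n / (2 * math.pi)) + C_m

def asymptotic_regret_side_info(n, m, k):
    """Side-information version: (k(m-1)/2) log(n/2pi k) + k C_m."""
    C_m = m * lgamma(0.5) - lgamma(m / 2)
    return (k * (m - 1) / 2) * math.log(n / (2 * math.pi * k)) + k * C_m

# Example: a balanced binary sequence; the pointwise regret should be
# close to the asymptotic value (1/2) log(n/2pi) + log(pi).
xs = [0, 1] * 500
print(regret(xs, m=2), asymptotic_regret(len(xs), m=2))
```

The closed form in `kt_log_prob` is just the Dirichlet-multinomial integral $\int \prod_a \theta_a^{n_a}\,\mathrm{Dir}(\theta; 1/2, \ldots, 1/2)\,d\theta$, so no sequential updating is needed; for $m = 2$ the constant evaluates to $C_2 = \log \pi$.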
