Sequential prediction of individual sequences under general loss functions

Authors: D. Haussler, J. Kivinen, M.K. Warmuth

DOI: 10.1109/18.705569

Keywords: Function (mathematics); Probabilistic logic; Mathematical optimization; Mathematics; Interval (mathematics); Sequence; Binary logarithm; Constant (mathematics); Combinatorics; Binary number; Regret

Abstract: We consider adaptive sequential prediction of arbitrary binary sequences when the performance is evaluated using a general loss function. The goal is to predict on each individual sequence nearly as well as the best strategy in a given comparison class of (possibly adaptive) prediction strategies, called experts. By using a general loss function, we generalize previous work on universal prediction, forecasting, and data compression. However, here we restrict ourselves to the case when the comparison class is finite. For a given sequence, we define the regret as the total loss on the entire sequence suffered by the adaptive sequential predictor, minus the total loss suffered by the predictor in the comparison class that performs best on that particular sequence. We show that for a large class of loss functions, the minimax regret is either Θ(log N) or Ω(√(ℒ log N)), depending on the loss function, where N is the number of predictors in the comparison class and ℒ is the length of the sequence to be predicted. The former case was shown previously by Vovk (1990); we give a simplified analysis with an explicit closed form for the constant in the minimax regret formula, and a probabilistic argument showing that this constant is the best possible. Some weak regularity conditions are imposed on the loss function in obtaining these results. We also extend our results to the case of predicting sequences that take real values in the interval [0,1].
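To make the quantities in the abstract concrete, the following LaTeX sketch restates the regret and the two minimax regimes. The symbol names used here (outcomes y_t, predictions ŷ_t, expert predictions ξ_{i,t}, loss λ) are chosen for illustration only; the precise regularity conditions on the loss function are stated in the paper, not here.

% Regret against a finite comparison class of N experts,
% over a binary sequence of length \mathcal{L} (notation chosen for illustration).
\[
  R \;=\; \sum_{t=1}^{\mathcal{L}} \lambda(y_t,\hat{y}_t)
      \;-\; \min_{1\le i\le N} \sum_{t=1}^{\mathcal{L}} \lambda(y_t,\xi_{i,t}),
  \qquad y_t \in \{0,1\},\; \hat{y}_t,\xi_{i,t} \in [0,1].
\]
% The paper's main result: for loss functions satisfying weak regularity
% conditions, the minimax value of R falls into one of two regimes.
\[
  \text{minimax regret} \;=\; \Theta(\log N)
  \quad\text{or}\quad
  \Omega\!\bigl(\sqrt{\mathcal{L}\,\log N}\bigr),
  \quad\text{depending on the loss } \lambda .
\]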

References (34)
A. Barron, D. Haussler, How Well Do Bayes Methods Work for On-line Prediction of {±1} Values?, University of California at Santa Cruz (1992)
N. Cesa-Bianchi, M.K. Warmuth, P.M. Long, Worst-case Quadratic Loss Bounds for On-line Prediction of Linear Functions by Gradient Descent, University of California at Santa Cruz (1993)
M. Opper, D. Haussler, Worst Case Prediction over Sequences under Log Loss, The Mathematics of Information Coding, Extraction and Distribution, pp. 81-90 (1999), 10.1007/978-1-4612-1524-0_6
N. Cesa-Bianchi, Y. Freund, D.P. Helmbold, M.K. Warmuth, On-line Prediction and Conversion Strategies, European Conference on Computational Learning Theory, vol. 25, pp. 205-216 (1994), 10.1023/A:1018348209754
D.P. Helmbold, R.E. Schapire, Y. Singer, M.K. Warmuth, On-line Portfolio Selection Using Multiplicative Updates, Mathematical Finance, vol. 8, pp. 325-347 (1998), 10.1111/1467-9965.00058
N. Cesa-Bianchi, P.M. Long, M.K. Warmuth, Worst-case Quadratic Loss Bounds for Prediction Using Linear Functions and Gradient Descent, IEEE Transactions on Neural Networks, vol. 7, pp. 604-619 (1996), 10.1109/72.501719
A. DeSantis, G. Markowsky, M.N. Wegman, Learning Probabilistic Prediction Functions, 29th Annual Symposium on Foundations of Computer Science, pp. 110-119 (1988), 10.1109/SFCS.1988.21929
Y. Freund, Predicting a Binary Sequence Almost as Well as the Optimal Biased Coin, Proceedings of the Ninth Annual Conference on Computational Learning Theory (COLT '96), pp. 89-98 (1996), 10.1145/238061.238072
N. Cesa-Bianchi, Y. Freund, D. Haussler, D.P. Helmbold, R.E. Schapire, M.K. Warmuth, How to Use Expert Advice, Journal of the ACM, vol. 44, pp. 427-485 (1997), 10.1145/258128.258179