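Abstract: External regret compares the performance of an online algorithm, selecting among N actions, to the performance of the best of those actions in hindsight. Internal regret compares the loss of an online algorithm to the loss of a modified online algorithm, which consistently replaces one action by another. In this paper, we give a simple generic reduction that, given an algorithm for the external regret problem, converts it to an efficient online algorithm for the internal regret problem. We provide methods that work both in the full information model, in which the loss of every action is observed at each time step, and in the partial information (bandit) model, where at each time step only the loss of the selected action is observed. The importance of internal regret in game theory is due to the fact that in a general game, if each player has sublinear internal regret, then the empirical frequencies converge to a correlated equilibrium. For external regret we also derive a quantitative regret bound for a very general setting of regret, which includes an arbitrary set of modification rules (that possibly modify the online algorithm) and an arbitrary set of time selection functions (each giving different weight to each time step). The regret for a given time selection function and modification rule is the difference between the cost of the online algorithm and the cost of the modified online algorithm, where the costs are weighted by the time selection function. This can be viewed as a generalization of the previously-studied sleeping experts setting.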
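
To make the two notions of regret concrete, the following is a minimal sketch for the full information model. It assumes the losses are given as a T-by-N table and the algorithm's choices as a list of action indices. The function names `external_regret` and `internal_regret` are illustrative only and do not come from the paper, and the code only measures regret after the fact; it is not the reduction itself.

```python
from itertools import permutations


def external_regret(losses, actions):
    """Loss of the played sequence minus the loss of the best fixed action."""
    n_actions = len(losses[0])
    incurred = sum(losses[t][a] for t, a in enumerate(actions))
    best_fixed = min(sum(row[i] for row in losses) for i in range(n_actions))
    return incurred - best_fixed


def internal_regret(losses, actions):
    """Largest gain from replacing every play of action i by action j."""
    n_actions = len(losses[0])
    best_gain = 0.0
    for i, j in permutations(range(n_actions), 2):
        # Only the time steps on which action i was actually played count.
        gain = sum(losses[t][i] - losses[t][j]
                   for t, a in enumerate(actions) if a == i)
        best_gain = max(best_gain, gain)
    return best_gain


# Hypothetical data: 4 time steps, 3 actions.
losses = [[1, 5, 0], [1, 5, 0], [5, 0, 5], [5, 0, 5]]
actions = [0, 0, 1, 1]
ext = external_regret(losses, actions)
internal = internal_regret(losses, actions)
```

On this hypothetical data the played sequence beats every fixed action, so the external regret is negative, yet swapping each play of action 0 for action 2 would still have lowered the loss. This is the kind of gap between the two measures that the abstract's distinction captures.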