Learning Multiple Markov Chains via Adaptive Allocation

作者: Odalric-Ambrym Maillard , Mohammad Sadegh Talebi

DOI:

关键词:

摘要: We study the problem of learning transition matrices a set Markov chains from single stream observations on each chain. assume that are ergodic but otherwise unknown. The learner can sample sequentially to observe their states. goal is select various learn uniformly well with respect some loss function. introduce notion naturally extends squared for distributions case chains, and further characterize being \emph{uniformly good} in all instances. present novel algorithm efficiently balances \emph{exploration} \emph{exploitation} intrinsic this problem, without any prior knowledge chains. provide finite-sample PAC-type guarantees performance algorithm. Further, we show our asymptotically attains an optimal loss.

参考文章(36)
Alon Orlitsky, Dheeraj Pichapati, Sudeep Kamath, Ananda Theertha Suresh, On Learning Distributions from their Samples conference on learning theory. pp. 1066- 1100 ,(2015)
David A. Levin, Elizabeth L. Wilmer, Y. Peres, Y. Peres, Y. Peres, Markov Chains and Mixing Times ,(2008)
Daniel Paulin, Concentration inequalities for Markov chains by Marton couplings and spectral methods Electronic Journal of Probability. ,vol. 20, pp. 1- 32 ,(2015) , 10.1214/EJP.V20-4039
Pierre Étoré, Benjamin Jourdain, Adaptive Optimal Allocation in Stratified Sampling Methods Methodology and Computing in Applied Probability. ,vol. 12, pp. 335- 360 ,(2010) , 10.1007/S11009-008-9108-0
Pascal Lezaud, Chernoff-type bound for finite Markov chains Annals of Applied Probability. ,vol. 8, pp. 849- 867 ,(1998) , 10.1214/AOAP/1028903453
C. Kipnis, S. R. S. Varadhan, Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions Communications in Mathematical Physics. ,vol. 104, pp. 1- 19 ,(1986) , 10.1007/BF01210789
Jiantao Jiao, Kartik Venkat, Yanjun Han, Tsachy Weissman, Minimax Estimation of Functionals of Discrete Distributions IEEE Transactions on Information Theory. ,vol. 61, pp. 2835- 2885 ,(2015) , 10.1109/TIT.2015.2412945
Richard L. Tweedie, Sean Meyn, Markov Chains and Stochastic Stability ,(1993)
T.L Lai, Herbert Robbins, Asymptotically efficient adaptive allocation rules Advances in Applied Mathematics. ,vol. 6, pp. 4- 22 ,(1985) , 10.1016/0196-8858(85)90002-8
Bruce A. Craig, Peter P. Sendi, Estimation of the transition matrix of a discrete-time Markov chain. Health Economics. ,vol. 11, pp. 33- 42 ,(2002) , 10.1002/HEC.654