Dualizing Le Cam's method, with applications to estimating the unseens

Authors: Yury Polyanskiy, Yihong Wu

Keywords: Distribution (mathematics), Combinatorics, Estimator, Exponential family, Upper and lower bounds, Minimax, Mathematics, Quadratic equation, Modulus of continuity, Separable space

Abstract: Le Cam's method (or the two-point method) is a commonly used tool for obtaining statistical lower bounds and is especially popular for functional estimation problems. This work aims to explain and give conditions for the tightness of Le Cam's lower bound in functional estimation from the perspective of convex duality. Under a variety of settings it is shown that the maximization problem that searches for the best two-point lower bound, upon dualizing, becomes a minimization problem that optimizes the bias-variance tradeoff among a family of estimators. For estimating linear functionals of a distribution, our work strengthens prior results of Donoho-Liu \cite{DL91} (for quadratic loss) by dropping the Hölderian assumption on the modulus of continuity. For exponential families, our results extend those of Juditsky-Nemirovski \cite{JN09} by characterizing the minimax risk for the quadratic loss under weaker assumptions on the exponential family. We also provide an extension to the high-dimensional setting for estimating separable functionals. Notably, coupled with tools from complex analysis, this method is particularly effective for characterizing the ``elbow effect'': the phase transition from parametric to nonparametric rates. As the main application we derive sharp minimax rates in the Distinct elements problem (given a fraction $p$ of colored balls from an urn containing $d$ balls, the optimal error of estimating the number of distinct colors is $\tilde \Theta(d^{-\frac{1}{2}\min\{\frac{p}{1-p},1\}})$) and Fisher's species problem (given $n$ iid observations from an unknown distribution, the optimal prediction error of the number of unseen symbols in the next (unobserved) $r \cdot n$ observations is $\tilde \Theta(n^{-\min\{\frac{1}{r+1},\frac{1}{2}\}})$).
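
The two rates quoted in the abstract can be made concrete with a small numerical sketch. The function names below are hypothetical and not from the paper; the code only evaluates the exponents of $d$ and $n$ as written in the abstract, ignores the polylogarithmic factors hidden by $\tilde \Theta$, and assumes the error is normalized as in the paper (the abstract does not spell this out).

```python
def distinct_elements_exponent(p: float) -> float:
    """Exponent a in the stated rate d**(-a) for the Distinct elements problem.

    p is the sampled fraction of balls; polylog factors are ignored.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return 0.5 * min(p / (1.0 - p), 1.0)


def species_exponent(r: float) -> float:
    """Exponent b in the stated rate n**(-b) for Fisher's species problem.

    r is the extrapolation ratio (predicting r * n future observations).
    """
    if r <= 0.0:
        raise ValueError("r must be positive")
    return min(1.0 / (r + 1.0), 0.5)


if __name__ == "__main__":
    # Illustrative values only; these are not results reported in the paper.
    for p in (0.1, 0.25, 0.5, 0.75):
        print(f"p = {p}: exponent of d is {distinct_elements_exponent(p):.4f}")
    for r in (0.5, 1.0, 3.0):
        print(f"r = {r}: exponent of n is {species_exponent(r):.4f}")
```

Under these formulas the minimum switches branches at $p = 1/2$ and at $r = 1$: for $p \ge 1/2$ or $r \le 1$ the exponent stays at $1/2$ (the parametric rate), and otherwise it depends on $p$ or $r$, which is where the elbow sits.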

References (34)
Gregory John Valiant, Christos Papadimitriou. Algorithmic approaches to statistical questions. University of California at Berkeley (2012).
Boris Y. Levit, Richard D. Gill. Applications of the van Trees inequality: a Bayesian Cramér-Rao bound. Bernoulli, vol. 1, pp. 59-79 (1995). doi:10.2307/3318681
Alexandre B. Tsybakov. Introduction to Nonparametric Estimation (2008).
Walter Rudin. Real and complex analysis (1966).
Shachar Lovett, Jiapeng Zhang. Improved Noisy Population Recovery, and Reverse Bonami-Beckner Inequality for Sparse Functions. Symposium on the Theory of Computing, pp. 137-142 (2015). doi:10.1145/2746539.2746540
I. Ionita-Laza, C. Lange, N. M. Laird. Estimating the number of unseen variants in the human genome. Proceedings of the National Academy of Sciences of the United States of America, vol. 106, pp. 5008-5013 (2009). doi:10.1073/PNAS.0807815106