Optimal Bounds for Estimating Entropy with PMF Queries

作者: Cafer Caferov , Barış Kaya , Ryan O’Donnell , AC Cem Say , None

DOI: 10.1007/978-3-662-48054-0_16

关键词:

摘要: Let p be an unknown probability distribution on \([n] := \{1, 2, \dots n\}\) that we can access via two kinds of queries: A SAMP query takes no input and returns \(x \in [n]\) with p[x]; a PMF as the value p[x]. We consider task estimating entropy to within \(\pm \varDelta \) (with high probability). For usual Shannon H(p), show \(\varOmega (\log ^2 n/\varDelta ^2)\) queries are necessary, matching recent upper bound Canonne Rubinfeld. Renyi \(H_\alpha \)(p), where \(\alpha >1\), \(\varTheta (n^{1-1/\alpha })\) necessary sufficient. This complements work Acharya et al. in \(\mathsf \)-only model showed \(O(n^{1-1/\alpha suffice when is integer, but \(\widetilde{\varOmega }(n)\) noninteger. All our lower bounds also easily extend CDF (given x, return \(\sum _{y \le x}\) p[y]) allowed.

参考文章(20)
Ashwin Lall, Vyas Sekar, Mitsunori Ogihara, Jun Xu, Hui Zhang, Data streaming algorithms for estimating entropy of network traffic ACM SIGMETRICS Performance Evaluation Review. ,vol. 34, pp. 145- 156 ,(2006) , 10.1145/1140103.1140295
Liam Paninski, Estimation of entropy and mutual information Neural Computation. ,vol. 15, pp. 1191- 1253 ,(2003) , 10.1162/089976603321780272
Gregory Valiant, Paul Valiant, Estimating the unseen: an n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs symposium on the theory of computing. pp. 685- 694 ,(2011) , 10.1145/1993636.1993727
Nicholas J.A. Harvey, Jelani Nelson, Krzysztof Onak, Sketching and Streaming Entropy via Approximation Theory foundations of computer science. pp. 489- 498 ,(2008) , 10.1109/FOCS.2008.76
Clément L. Canonne, A Survey on Distribution Testing: Your Data is Big. But is it Blue? Electronic Colloquium on Computational Complexity. ,vol. 22, pp. 63- ,(2015)
Andrew McGregor, Amit Chakrabarti, Graham Cormode, A near-optimal algorithm for computing the entropy of a stream symposium on discrete algorithms. pp. 328- 335 ,(2007) , 10.5555/1283383.1283418
Alon Orlitsky, Ananda Theertha Suresh, Jayadev Acharya, Himanshu Tyagi, The complexity of estimating Rényi entropy symposium on discrete algorithms. pp. 1855- 1869 ,(2015) , 10.5555/2722129.2722253
Paul Valiant, Gregory Valiant, A CLT and tight lower bounds for estimating entropy. Electronic Colloquium on Computational Complexity. ,vol. 17, pp. 179- ,(2010)
Clément Canonne, Ronitt Rubinfeld, Testing Probability Distributions Underlying Aggregated Data Automata, Languages, and Programming. pp. 283- 295 ,(2014) , 10.1007/978-3-662-43948-7_24
Amit Chakrabarti, Khanh Do Ba, S. Muthukrishnan, Estimating entropy and entropy norm on data streams symposium on theoretical aspects of computer science. ,vol. 3, pp. 196- 205 ,(2006) , 10.1007/11672142_15