Performance Guarantees for Regularized Maximum Entropy Density Estimation

Authors: Miroslav Dudík, Steven J. Phillips, Robert E. Schapire

DOI: 10.1007/978-3-540-27819-1_33

Keywords:

Abstract: We consider the problem of estimating an unknown probability distribution from samples using the principle of maximum entropy (maxent). To alleviate overfitting with a very large number of features, we propose applying the maxent principle with relaxed constraints on the expectations of the features. By convex duality, this turns out to be equivalent to finding the Gibbs distribution minimizing a regularized version of the empirical log loss. We prove non-asymptotic bounds showing that, with respect to the true underlying distribution, this relaxed version of maxent produces density estimates that are almost as good as the best possible. These bounds are in terms of the deviation of the empirical feature averages relative to their true expectations, a quantity that can be bounded using standard uniform-convergence techniques. In particular, this leads to bounds that drop quickly with the number of samples and that depend only moderately on the number or complexity of the features. We also derive and prove convergence for both sequential-update and parallel-update algorithms. Finally, we briefly describe experiments on data relevant to the modeling of species geographical distributions.
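The duality described in the abstract, relaxing the constraints to |E_q[f_j] - Ê[f_j]| ≤ β instead of requiring equality, corresponds to minimizing an l1-regularized empirical log loss over Gibbs distributions. A minimal sketch of that dual minimization on a small finite domain follows, using plain subgradient descent; the function name `regularized_maxent` and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def regularized_maxent(domain, features, samples, beta=0.1, lr=0.5, iters=500):
    """Minimize the l1-regularized empirical log loss
        L(w) = log Z(w) - sum_j w_j * emp_j + beta * sum_j |w_j|
    over Gibbs distributions q_w(x) proportional to exp(sum_j w_j f_j(x)).
    By convex duality this solves the maxent problem with the relaxed
    constraints |E_q[f_j] - emp_j| <= beta.
    """
    m = len(features)
    # empirical feature averages over the sample
    emp = [sum(f(x) for x in samples) / len(samples) for f in features]
    w = [0.0] * m
    q = []
    for _ in range(iters):
        # current Gibbs distribution q_w on the finite domain
        scores = [math.exp(sum(w[j] * features[j](x) for j in range(m)))
                  for x in domain]
        Z = sum(scores)
        q = [s / Z for s in scores]
        for j in range(m):
            # gradient of log Z(w) w.r.t. w_j is E_q[f_j]
            Eq = sum(qi * features[j](x) for qi, x in zip(q, domain))
            # subgradient of the l1 penalty beta * |w_j|
            sub = beta * (1.0 if w[j] > 0 else -1.0 if w[j] < 0 else 0.0)
            w[j] -= lr * (Eq - emp[j] + sub)
    return w, q
```

For example, with `domain = range(4)`, two features, and a small sample, the returned `q` is a proper distribution whose feature expectations lie within roughly `beta` of the empirical averages; the paper's sequential- and parallel-update algorithms solve the same objective with stronger convergence guarantees than this naive subgradient loop.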

References (20)
Vincent J. Della Pietra, Adam L. Berger, Stephen A. Della Pietra, A maximum entropy approach to natural language processing. Computational Linguistics, vol. 22, pp. 39–71 (1996). DOI: 10.5555/234285.234289
Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer, Smooth ε-insensitive regression by loss symmetrization. Learning Theory and Kernel Machines, vol. 6, pp. 433–447 (2003). DOI: 10.1007/978-3-540-45167-9_32
Robert Malouf, A comparison of algorithms for maximum entropy parameter estimation. International Conference on Computational Linguistics, pp. 1–7 (2002). DOI: 10.3115/1118853.1118871
Geoffrey E. Hinton, Max Welling, Richard S. Zemel, Self supervised boosting. Neural Information Processing Systems, vol. 15, pp. 681–688 (2002)
Saharon Rosset, Eran Segal, Boosting density estimation. Neural Information Processing Systems, vol. 15, pp. 657–664 (2002)
Luc Devroye, Bounds for the uniform deviation of empirical measures. Journal of Multivariate Analysis, vol. 12, pp. 72–79 (1982). DOI: 10.1016/0047-259X(82)90083-5
Charles Sutton, Khashayar Rohanimanesh, Andrew McCallum, Dynamic conditional random fields. Twenty-first International Conference on Machine Learning (ICML '04), pp. 99– (2004). DOI: 10.1145/1015330.1015422
S. Della Pietra, V. Della Pietra, J. Lafferty, Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 380–393 (1997). DOI: 10.1109/34.588021
S.F. Chen, R. Rosenfeld, A survey of smoothing techniques for ME models. IEEE Transactions on Speech and Audio Processing, vol. 8, pp. 37–50 (2000). DOI: 10.1109/89.817452