Author: Robert Malouf
Keywords: Gradient descent, Principle of maximum entropy, Variable (computer science), Estimation theory, Algorithm, Conjugate gradient method, Computer science, Generalized iterative scaling, Free parameter, Mathematical optimization, Metric (mathematics)
Abstract: Conditional maximum entropy (ME) models provide a general purpose machine learning technique which has been successfully applied to fields as diverse as computer vision and econometrics, and which is used for a wide variety of classification problems in natural language processing. However, the flexibility of ME models is not without cost. While parameter estimation for ME models is conceptually straightforward, in practice ME models for typical natural language tasks are very large, and may well contain many thousands of free parameters. In this paper, we consider a number of algorithms for estimating the parameters of ME models, including iterative scaling, gradient ascent, conjugate gradient, and variable metric methods. Surprisingly, the standardly used iterative scaling algorithms perform quite poorly in comparison to the others, and on all of the test problems, a limited-memory variable metric algorithm outperformed the other choices.
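To make the estimation problem concrete, the following is a minimal, hypothetical sketch (not the paper's implementation) of conditional ME parameter estimation by plain gradient ascent on the log-likelihood. It uses the standard multiclass construction in which the feature vector for class y is the input features placed in the weight block for y, so the gradient is the empirical feature counts minus the model's expected feature counts; all names and hyperparameters here are illustrative.

```python
import numpy as np

def log_likelihood_grad(W, X, y, n_classes):
    """Gradient of the conditional log-likelihood with respect to W.

    W: (n_classes, n_feats) weight matrix; X: (n, n_feats); y: (n,) labels.
    Returns empirical minus expected feature counts, per class block.
    """
    scores = X @ W.T                                # (n, n_classes)
    scores -= scores.max(axis=1, keepdims=True)     # for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)       # p(y | x; W)
    grad = np.zeros_like(W)
    for c in range(n_classes):
        # observed count of feature co-occurring with class c, minus expected
        grad[c] = ((y == c).astype(float) - probs[:, c]) @ X
    return grad

def fit_gradient_ascent(X, y, n_classes, lr=0.1, n_iter=500):
    """Estimate W by fixed-step gradient ascent (a baseline, not L-BFGS)."""
    W = np.zeros((n_classes, X.shape[1]))
    for _ in range(n_iter):
        W += lr * log_likelihood_grad(W, X, y, n_classes) / len(y)
    return W
```

The limited-memory variable metric method the paper favors (L-BFGS) replaces the fixed-step update with a quasi-Newton direction built from recent gradient differences, which typically converges in far fewer iterations on large, ill-conditioned problems.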