Training conditional random fields via gradient tree boosting

Authors: Thomas G. Dietterich, Adam Ashenfelter, Yaroslav Bulatov

DOI: 10.1145/1015330.1015428

Keywords:

Abstract: Conditional Random Fields (CRFs; Lafferty, McCallum, & Pereira, 2001) provide a flexible and powerful model for learning to assign labels to elements of sequences in such applications as part-of-speech tagging, text-to-speech mapping, protein and DNA sequence analysis, and information extraction from web pages. However, existing learning algorithms are slow, particularly in problems with large numbers of potential input features. This paper describes a new method for training CRFs by applying Friedman's (1999) gradient tree boosting method. In tree boosting, the CRF potential functions are represented as weighted sums of regression trees. Regression trees are learned by stage-wise optimizations similar to Adaboost, but with the objective of maximizing the conditional likelihood P(Y|X) of the CRF model. By growing trees, interactions among features are introduced only as needed, so although the parameter space is potentially immense, the search algorithm does not explicitly consider it. As a result, the method scales linearly in the order of the Markov model and the order of the feature interactions, rather than exponentially like previous algorithms based on iterative scaling and gradient descent.
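The core idea in the abstract, fitting regression trees to the functional gradient of a likelihood in stage-wise fashion, can be illustrated on a much simpler problem than CRF training. The sketch below is a minimal, self-contained example of Friedman-style gradient tree boosting for binary classification with logistic loss, where the negative functional gradient at each example reduces to the residual y − p. It is only an illustration of the general technique, not the paper's CRF algorithm (which would fit trees to the gradient of log P(Y|X) over sequence potentials); all names here are illustrative.

```python
import math

def fit_stump(x, residuals):
    """Fit a depth-1 regression tree (stump) to the residuals by
    minimizing squared error over all candidate split points."""
    best = None
    for split in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= split]
        right = [r for xi, r in zip(x, residuals) if xi > split]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda xi: lmean if xi <= split else rmean

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def boost(x, y, rounds=20, lr=0.5):
    """Stage-wise additive modeling: F(x) is a weighted sum of stumps,
    each fit to the negative gradient of the logistic loss (y - p)."""
    stumps = []
    for _ in range(rounds):
        F = [sum(s(xi) for s in stumps) for xi in x]
        # Functional gradient of the conditional log-likelihood at each point:
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        stump = fit_stump(x, residuals)
        stumps.append(lambda xi, s=stump: lr * s(xi))
    return lambda xi: sigmoid(sum(s(xi) for s in stumps))

# Toy 1-D data: class 1 when x > 2.5.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 0, 1, 1, 1]
predict = boost(x, y)
print(predict(0.5), predict(4.5))  # low vs. high probability
```

Because each round fits a tree to the current gradient, feature interactions enter the model only when a tree actually splits on them, which is the property the abstract credits for avoiding an explicit search over the full parameter space.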

References (16)
Structural, Syntactic, and Statistical Pattern Recognition. Lecture Notes in Computer Science, vol. 6218 (2002). 10.1007/978-3-642-14980-1
Thomas G. Dietterich, Machine Learning for Sequential Data: A Review. Lecture Notes in Computer Science, pp. 15–30 (2002). 10.1007/3-540-70659-3_2
T. J. Sejnowski, Parallel networks that learn to pronounce English text. Complex Systems, vol. 1, pp. 145–168 (1987).
Richard A. Olshen, Charles J. Stone, Leo Breiman, Jerome H. Friedman, Classification and Regression Trees (1983).
D. Geman, Random fields and inverse problems in imaging. Springer, Berlin, Heidelberg, pp. 115–193 (1990). 10.1007/BFB0103042
Jerome H. Friedman, Greedy function approximation: A gradient boosting machine. Annals of Statistics, vol. 29, pp. 1189–1232 (2001). 10.1214/AOS/1013203451
Adwait Ratnaparkhi, A Maximum Entropy Model for Part-Of-Speech Tagging. Empirical Methods in Natural Language Processing (1996).
Andrew McCallum, Dayne Freitag, Fernando C. N. Pereira, Maximum Entropy Markov Models for Information Extraction and Segmentation. International Conference on Machine Learning, pp. 591–598 (2000).
Robert E. Schapire, Yoav Freund, Experiments with a new boosting algorithm. International Conference on Machine Learning, pp. 148–156 (1996).