Authors: Sharon Goldwater, Tom Griffiths
DOI:
Keywords: Trigram, Discriminative model, Bayesian probability, Machine learning, Prior probability, Natural language, Computer science, Pattern recognition, Artificial intelligence, Unsupervised learning, Hidden Markov model, Generative model
Abstract: Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show, using part-of-speech tagging, that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters, the Bayesian approach integrates over all possible parameter values. This difference ensures that the learned structure will have high probability over a range of parameter values, and it permits the use of priors favoring the sparse distributions that are typical of natural language. Our model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE. We find improvements both when training from data alone and when using a dictionary.
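A minimal sketch of the contrast the abstract draws, with notation assumed here rather than taken from the paper: w is the observed word sequence, t the hidden tag sequence, \theta the HMM transition and emission parameters, and \alpha, \beta symmetric Dirichlet hyperparameters whose small values favor the sparse distributions mentioned above.

MLE (fit a single parameter set, then decode with it):
    \hat{\theta} = \arg\max_{\theta} P(w \mid \theta), \qquad \hat{t} = \arg\max_{t} P(t \mid w, \hat{\theta})

Fully Bayesian (score taggings by integrating the parameters out under the prior):
    P(t \mid w) \;\propto\; \int P(w, t \mid \theta)\, P(\theta \mid \alpha, \beta)\, d\theta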