Author: Martin Emms
DOI:
Keywords:
Abstract: Over the past decade, Machine Learning techniques have become an essential tool for Natural Language Processing. This introductory course will cover the basics of Machine Learning and present a selection of widely used algorithms, illustrating them with practical applications to Natural Language Processing. The course will start with a survey of the main concepts in Machine Learning, in terms of the decisions one needs to make when designing an application, including: type of training (supervised, unsupervised, active learning, etc.), data representation, choice and representation of the target function, and choice of algorithm. This will be followed by case studies designed to illustrate these concepts. Unsupervised learning (clustering) will be described and illustrated through tasks such as thesaurus induction, document class inference, and term extraction for text classification. Supervised learning, covering symbolic (e.g. decision trees) and non-symbolic approaches (e.g. probabilistic classifiers, instance-based classifiers, support vector machines), will be presented, and case studies on text classification and word sense disambiguation will be analysed in some detail. Finally, we will address the issue of learning target functions which assign structures to linear sequences, covering applications of Hidden Markov Models to part-of-speech tagging and of Probabilistic Grammars to the parsing of natural language.