Authors: Paul Rayson, Scott Piao, Serge Sharoff, Stefan Evert, Begoña Villada Moirón
DOI: 10.1007/S10579-009-9105-0
Keywords: Semantics, Term (time), Computational linguistics, Multiword expression, Chen, Speech recognition, Interpretation (logic), Computer science, Noun compounds, Phraseology, Subject (grammar), Linguistics
Abstract: Over the past two decades or so, Multi-Word Expressions (MWEs; also called Multi-word Units) have been an increasingly important concern for Computational Linguistics and Natural Language Processing (NLP). The term MWE has been used to refer to various types of linguistic units and expressions, including idioms, noun compounds, phrasal verbs, light verbs and other habitual collocations. However, while there is no universally agreed definition of MWE as yet, most researchers use the term to refer to those frequently occurring expressions which are subject to a certain level of semantic opaqueness, or non-compositionality. Non-compositional MWEs pose tough challenges for automatic analysis because their interpretation cannot be achieved by directly combining the semantics of their constituents, thereby causing the "pain in the neck for NLP" (Sag et al. 2001). In fact, MWEs have been studied in Phraseology under the term phraseological unit. But in the early 1990s, MWEs started receiving increasing attention in corpus-based computational linguistics and NLP. Early influential work on MWEs includes Smadja (1993), Dagan and Church (1994), Wu (1997), Daille (1995), Wermter and Chen, McEnery, and Michiels and Dufour (1998). These studies address the treatment of MWEs and their applications in practical NLP and information systems. A milestone