Authors: Bernd Kiefer, Ann A. Copestake, Benjamin Waldron, Ulrich Schäfer
DOI:
Keywords: Minimal recursion semantics, XML, Annotation, Middleware (distributed applications), Preprocessor, Programming language, Rule-based machine translation, Computer science, Interface (Java), Parsing
Abstract: We discuss preprocessing and tokenisation standards within DELPH-IN, a large-scale open-source collaboration providing multiple independent multilingual shallow and deep processors. We discuss (i) a component-specific XML interface format which has been used for some time to interface preprocessor results to the PET parser, and (ii) our implementation of a more generic XML interface format influenced heavily by the (ISO working draft) Morphosyntactic Annotation Framework (MAF). Our generic format encapsulates the information which may be passed from the preprocessing stage to a parser: it uses standoff annotation, a lattice for the representation of structural ambiguity, and intra-annotation dependencies, and it allows highly structured annotation content. This work builds on the existing Heart of Gold middleware system and on previous work on Robust Minimal Recursion Semantics (RMRS) as part of an inter-component interface. We give examples of usage with a number of the DELPH-IN processing components and deep grammars.