Authors: Thomas Krause, Amir Zeldes
DOI: 10.1093/LLC/FQU057
Keywords:
Abstract: This article is concerned with the data structures, properties of query languages, and visualization facilities required for the generic representation of richly annotated, heterogeneous linguistic corpora. We propose that above and beyond a general graph-based data model, which is becoming increasingly popular in many complex annotation formats, a well-defined concept of multiple, potentially conflicting segmentation layers must be introduced to deal with different sources and applications of corpus data flexibly. We also propose a solution for the generic representation of specialized visualizations in a Web interface using annotation-triggered style sheets, which leverage the power of modern browsers and CSS for multiple highly customizable views of primary data. We offer an implementation and evaluation of our architecture in ANNIS3, an open-source browser-based architecture for corpus search and visualization. We present three case studies to test the coverage of the system, encompassing core digital humanities use-cases including richly annotated newspaper treebanks, multilingual diplomatic and normalized manuscript materials edited in TEI, and the analysis of multimodal recordings of spoken language.