TIGER: Linguistic Interpretation of a German Corpus

Authors: Sabine Brants, Stefanie Dipper, Peter Eisenberg, Silvia Hansen-Schirra, Esther König

DOI: 10.1007/S11168-004-7431-3

Abstract: This paper reports on the TIGER Treebank, a corpus of currently 40,000 syntactically annotated German newspaper sentences. We describe what kind of information is encoded in the treebank and introduce the different representation formats that are used for the annotation and exploitation of the treebank. We explain the different methods used for the annotation: interactive annotation, using the tool ANNOTATE, and LFG parsing. Furthermore, we give an account of the annotation scheme used for the TIGER treebank. This scheme is an extended and improved version of the NEGRA annotation scheme, and we illustrate in detail the linguistic extensions that were made concerning the annotation in the TIGER project. The main differences are concerned with coordination, verb-subcategorization, expletives as well as proper nouns. In addition, the paper also presents the query tool TIGERSearch, which was developed in the project to exploit the treebank in an adequate way. We describe the query language, which was designed to facilitate a simple formulation of complex queries; furthermore, we briefly introduce TIGERin, a graphical user interface for query input. The paper concludes with a summary and some directions for future work.
