Dependency Treebank of Urdu and its Evaluation

Authors: Riyaz Ahmad Bhat, Dipti Misra Sharma

Abstract: In this paper we describe a currently underway treebanking effort for Urdu, a South Asian language. The treebank is built from a newspaper corpus and uses a Karaka-based grammatical framework inspired by Paninian grammatical theory. Thus far 3366 sentences (0.1M words) have been annotated with linguistic information at the morpho-syntactic (morphological, part-of-speech and chunk information) and syntactico-semantic (dependency) levels. This work also aims to evaluate the correctness or reliability of this manually annotated dependency treebank. Evaluation is done by measuring the inter-annotator agreement on a manually annotated data set of 196 sentences (5600 words) annotated by two annotators. We present a qualitative analysis of the agreement statistics and identify the possible reasons for the disagreement between the annotators. We also show the syntactic annotation of some constructions specific to Urdu, like Ezafe, and discuss the problem of word segmentation (tokenization).
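As a rough illustration of what measuring agreement between two annotators can look like, the sketch below computes Cohen's kappa over dependency labels assigned to the same tokens. The abstract does not say which agreement statistic the authors use, so the choice of Cohen's kappa is an assumption, and the function name `cohen_kappa` and the toy label sequences are hypothetical rather than taken from the treebank.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same tokens.

    Assumption: the paper's abstract does not name its agreement
    statistic; kappa is used here only as a common example.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label sequences")
    n = len(labels_a)

    # Observed agreement: share of tokens given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's label proportions,
    # summed over labels (a Counter returns 0 for labels it never saw).
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: Karaka-style dependency labels for six tokens.
annotator_1 = ["k1", "k2", "k7p", "k1", "r6", "k2"]
annotator_2 = ["k1", "k2", "k7t", "k1", "r6", "k1"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which matters when a few labels dominate the annotation. Comparing attachment decisions (heads) as well as labels would need a separate pass over the same token alignment.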
