Tweet comma corpus Janes-Vejica 1.0

Authors: Darja Fišer, Damjan Popič, Teja Kavčič, Polona Logar, Tomaž Erjavec

DOI:

Keywords: Natural language processing, Sentence segmentation, Linguistics, On Language, Word (computer architecture), Computer-mediated communication, Manual annotation, Computer science, Typology, Artificial intelligence

Abstract: Janes-Vejica is a corpus of Slovene tweets in which commas are annotated with the reason for their (in)correct use, according to the supplied typology. The corpus was sampled from Janes-Norm (http://hdl.handle.net/11356/1084), which has manually annotated tokenisation, sentence segmentation, and word normalisation, and automatically assigned morphosyntactic descriptions and lemmas. The corpus is further described in: POPIČ, Damjan, FIŠER, Darja, ZUPAN, Katja, LOGAR, Polona. Raba vejice v uporabniških spletnih vsebinah. Proceedings of the Conference on Language Technologies & Digital Humanities, Ljubljana, Slovenia. 2016, pp. 149-153. http://www.sdjt.si/wp/dogodki/konference/jtdh-2016/zbornik/
