Authors: Paul Piwek, Svetlana Stoyanchev
DOI:
Keywords: Dialogue acts, Computer science, Natural language processing, Paraphrase, Linguistics, Rhetorical question, Artificial intelligence, Corpus linguistics, Annotation, Structure (mathematical logic), Coda
Abstract: We describe the construction of the CODA corpus, a parallel corpus of monologues and expository dialogues. The dialogue part consists of expository, i.e., information-delivering rather than dramatic, dialogues written by several acclaimed authors. The monologue part is a paraphrase, in monologue form, of these dialogues by a human annotator. The corpus was constructed as a resource for extracting rules for the automated generation of dialogue from monologue. Using authored dialogues allows us to analyse the techniques used by accomplished writers for presenting information in the form of dialogue. The corpus is annotated with dialogue acts and rhetorical structure. We developed annotation and translation guidelines, together with a custom-developed tool for carrying out translation, alignment and annotation.