Authors: Ryan Lowe, Nissan Pow, Joelle Pineau, Iulian Serban
DOI:
Keywords:
Abstract: This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter. We also describe two neural learning architectures suitable for analyzing this dataset, and provide benchmark performance on the task of selecting the best next response.