The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems

Authors: Ryan Lowe, Nissan Pow, Joelle Pineau, Iulian Serban

Abstract: This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter. We also describe two neural learning architectures suitable for analyzing this dataset, and provide benchmark performance on the task of selecting the best next response.
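
The next-response selection task mentioned in the abstract can be illustrated with a minimal sketch: given a dialogue context and several candidate responses, score each candidate, rank them, and check whether the true response lands among the top k. Everything below is an assumption for illustration only. The word-overlap scorer is a crude stand-in rather than either of the neural architectures the paper describes, and the function names (`overlap_score`, `rank_candidates`, `hit_at_k`) and example utterances are hypothetical.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokenization; deliberately crude."""
    return set(text.lower().split())


def overlap_score(context: str, response: str) -> float:
    """Jaccard overlap between context and response tokens.

    Hypothetical stand-in scorer, NOT the paper's neural model.
    """
    c, r = tokenize(context), tokenize(response)
    if not c or not r:
        return 0.0
    return len(c & r) / len(c | r)


def rank_candidates(context: str, candidates: List[str]) -> List[int]:
    """Return candidate indices ordered from highest to lowest score."""
    scores = [overlap_score(context, cand) for cand in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


def hit_at_k(ranking: List[int], true_index: int, k: int) -> bool:
    """True if the correct response is among the k top-ranked candidates."""
    return true_index in ranking[:k]


# Made-up example: one context, three candidates, index 1 is the true reply.
context = "how do i install the nvidia driver on ubuntu"
candidates = [
    "i like pizza",
    "open additional drivers and install the nvidia driver from there",
    "what time is it",
]
ranking = rank_candidates(context, candidates)
print(ranking, hit_at_k(ranking, true_index=1, k=1))
```

Averaging `hit_at_k` over many context/candidate sets gives one plausible way to report benchmark performance on this task; any learned scorer could be swapped in for `overlap_score` without changing the ranking or evaluation code.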
