Authors: Gokhan Tür, Asli Celikyilmaz, Dilek Hakkani-Tür
DOI:
Keywords: Spoken language, Latent variable model, Feature model, Task (project management), Bayesian probability, Computer science, The Internet, Web search query, Utterance, Artificial intelligence, Natural language processing
Abstract: A key task in Spoken Language Understanding (SLU) is interpreting user intentions from speech utterances. This task is considered to be a classification problem, with the goal of categorizing a given utterance into one of many semantic intent classes. Due to substantial utterance variation, a significant quantity of labeled utterances is needed to build robust intent detection systems. In this paper, we approach intent detection as a two-stage semi-supervised learning problem, which utilizes a large number of unlabeled queries collected from internet search engine click logs. We first capture the underlying structure of the queries using a Bayesian latent feature model. We then propagate this structure onto the unlabeled queries to obtain quality training data via a graph summarization algorithm. Our approach improves intent detection compared to our baseline, which uses a standard classification model with actual features.