Leveraging Web Query Logs to Learn User Intent Via Bayesian Discrete Latent Variable Model

Authors: Gokhan Tür, Asli Celikyilmaz, Dilek Hakkani-Tür

Keywords: Spoken language, Latent variable model, Feature model, Task (project management), Bayesian probability, Computer science, The Internet, Web search query, Utterance, Artificial intelligence, Natural language processing

Abstract: A key task in Spoken Language Understanding (SLU) is interpreting user intentions from speech utterances. This is considered to be a classification problem with the goal of categorizing a given utterance into one of many semantic intent classes. Due to substantial variation, a significant quantity of labeled utterances is needed to build robust detection systems. In this paper, we approach this as a two-stage semi-supervised learning problem, which utilizes a large number of unlabeled queries collected from internet search engine click logs. We first capture the underlying structure using a Bayesian latent feature model. We then propagate it onto the unlabeled queries to obtain quality training data via a graph summarization algorithm. Our approach improves on our baseline, which uses a standard model with actual features.
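
The second stage described in the abstract, spreading information from a small labeled set onto unlabeled queries over a graph, can be illustrated with a minimal sketch. This is not the authors' graph summarization algorithm: it uses plain iterative label propagation over a cosine-similarity graph as a stand-in, and the binary vectors below are hand-written placeholders for the latent features that the Bayesian model would learn. The function names, the toy data, and the two intent classes are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def propagate_labels(features, seed_labels, n_classes, n_iters=50):
    """Spread class scores from labeled items to unlabeled ones over a similarity graph.

    features    : list of non-negative feature vectors, one per utterance or query
    seed_labels : dict mapping item index -> class id for the labeled items
    Returns a list with one predicted class id per item.
    """
    n = len(features)
    # Dense similarity graph without self-loops (assumes non-negative features,
    # so all edge weights are >= 0).
    weights = [
        [cosine(features[i], features[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Labeled items start with a one-hot score; unlabeled items start at zero.
    scores = [[0.0] * n_classes for _ in range(n)]
    for i, c in seed_labels.items():
        scores[i][c] = 1.0

    for _ in range(n_iters):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if i in seed_labels or total == 0.0:
                # Labeled items stay clamped; isolated items keep their scores.
                updated.append(scores[i][:])
                continue
            # Weighted average of the neighbours' current scores.
            updated.append([
                sum(weights[i][j] * scores[j][c] for j in range(n)) / total
                for c in range(n_classes)
            ])
        scores = updated

    return [max(range(n_classes), key=lambda c: scores[i][c]) for i in range(n)]


# Hypothetical binary latent-feature vectors: items 0 and 1 are labeled
# utterances, items 2-4 stand in for unlabeled web queries.
features = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
]
seed_labels = {0: 0, 1: 1}
predicted = propagate_labels(features, seed_labels, n_classes=2)
```

In this sketch the unlabeled items end up with the class of the labeled items they most resemble, and those automatically labeled items could then serve as extra training data for an intent classifier.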
