A Semi-supervised Clustering Approach for Semantic Slot Labelling

Authors: Heriberto Cuayahuitl, Nina Dethlefs, Helen Hastie

DOI: 10.1109/ICMLA.2014.87

Abstract: Work on training semantic slot labellers for use in Natural Language Processing applications has typically either relied on large amounts of labelled input data, or assumed entirely unlabelled inputs. The former technique tends to be costly to apply, while the latter is often not as accurate as its supervised counterpart. Here, we present a semi-supervised learning approach that automatically labels the semantic slots in a set of training data and aims to strike a balance between the dependence on labelled data and prediction accuracy. The essence of our algorithm is to cluster clauses based on a similarity function that combines lexical and semantic information. We present experiments that compare different similarity functions for both our semi-supervised setting and a fully unsupervised baseline. While semi-supervised learning expectedly outperforms unsupervised learning, our results show that (1) this effect can be observed based on very few training data instances and increasing the size of the training data does not lead to better performance, and (2) lexical and semantic information contribute differently in different domains, so that clustering based on both types of information offers the best generalisation.
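
The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Jaccard overlap for both the lexical and the semantic component, a hypothetical weighting parameter `alpha`, and a simple nearest-seed assignment in place of the paper's clustering procedure. The slot names, clauses, and semantic tags are made up for illustration.

```python
from typing import Dict, List, Set, Tuple

# A clause is its text plus a set of semantic tags (hypothetical features;
# the abstract does not specify how semantic information is represented).
Clause = Tuple[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(x: Clause, y: Clause, alpha: float = 0.5) -> float:
    """Weighted mix of lexical and semantic overlap (alpha is an assumption)."""
    lexical = jaccard(set(x[0].lower().split()), set(y[0].lower().split()))
    semantic = jaccard(x[1], y[1])
    return alpha * lexical + (1.0 - alpha) * semantic


def label_clauses(
    seeds: Dict[str, List[Clause]],
    unlabelled: List[Clause],
    alpha: float = 0.5,
) -> List[str]:
    """Assign each unlabelled clause the slot of its most similar seed clause."""
    labels = []
    for clause in unlabelled:
        best_slot = max(
            seeds,
            key=lambda slot: max(similarity(clause, s, alpha) for s in seeds[slot]),
        )
        labels.append(best_slot)
    return labels


# A handful of labelled seed clauses per slot (made-up data).
seeds = {
    "food": [("serves italian food", {"CUISINE"})],
    "area": [("located in the city centre", {"LOCATION"})],
}
unlabelled = [
    ("it serves cheap chinese food", {"CUISINE", "PRICE"}),
    ("in the north of the city", {"LOCATION"}),
]
labels = label_clauses(seeds, unlabelled)
print(labels)
```

Setting `alpha` to 1.0 or 0.0 in this sketch uses only lexical or only semantic overlap, which mirrors the kind of comparison between similarity functions that the abstract describes.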
