Authors: Heriberto Cuayahuitl, Nina Dethlefs, Helen Hastie
Abstract: Work on training semantic slot labellers for use in Natural Language Processing applications has typically either relied on large amounts of labelled input data, or assumed entirely unlabelled inputs. The former technique tends to be costly to apply, while the latter is often not as accurate as its supervised counterpart. Here, we present a semi-supervised learning approach that automatically labels the slots in a set of data and aims to strike a balance between dependence on labelled data and prediction accuracy. In essence, our algorithm clusters clauses based on a similarity function that combines different types of lexical information. We present experiments that compare different similarity functions in both the semi-supervised setting and a fully unsupervised baseline. While semi-supervised learning expectedly outperforms unsupervised learning, our results show that (1) this effect can be observed with very few labelled instances, and increasing their number does not lead to better performance, and (2) the information types contribute differently across domains, so that clustering with a combination of types offers the best generalisation.
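To make the core idea concrete, the following is a minimal sketch (not the authors' implementation) of semi-supervised slot labelling: a handful of labelled seed clauses anchor the clusters, and each unlabelled clause is assigned the slot label of its most lexically similar seed. The helper names, the token-overlap (Jaccard) similarity, and the example clauses are all illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of semi-supervised slot labelling via a
# lexical similarity function; names and data are illustrative.

def tokens(clause):
    """Lowercased token set of a clause (purely lexical view)."""
    return set(clause.lower().split())

def jaccard(a, b):
    """Lexical similarity: token-set overlap between two clauses."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def label_clauses(seeds, unlabelled):
    """Assign each unlabelled clause the slot label of its most
    similar labelled seed clause."""
    out = {}
    for clause in unlabelled:
        best_seed, best_label = max(seeds,
                                    key=lambda s: jaccard(clause, s[0]))
        out[clause] = best_label
    return out

# A few labelled seeds stand in for the small labelled set the
# abstract says is sufficient.
seeds = [("book a flight to london", "destination"),
         ("i want to leave on monday", "date")]
unlabelled = ["a flight to paris please", "leave next monday morning"]
print(label_clauses(seeds, unlabelled))
# → {'a flight to paris please': 'destination',
#    'leave next monday morning': 'date'}
```

The similarity function is the pluggable component the abstract's experiments compare: swapping `jaccard` for a function that also uses other information types changes how clauses group, which is the comparison the paper reports.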