Authors: Heriberto Cuayahuitl, Nina Dethlefs, Helen Hastie, Xingkun Liu
Keywords: Similarity (geometry), Quality (business), Training set, Surface (mathematics), Natural language processing, Online learning, Unlabelled data, Function (mathematics), Artificial intelligence, Computer science, Pattern recognition, Labelling
Abstract: Training a statistical surface realiser typically relies on labelled training data or parallel data sets, such as corpora of paraphrases. The procedure for obtaining such data for new domains is not only time-consuming, but it also restricts the incorporation of new semantic slots during an interaction, i.e. using an online learning scenario for automatically extended domains. Here, we present an alternative approach to statistical surface realisation from unlabelled data through automatic semantic slot labelling. In essence, our algorithm clusters clauses based on a similarity function that combines lexical and semantic information. Annotations need to be reliable enough to be utilised within a spoken dialogue system. We compare different similarity functions and evaluate our surface realiser, trained from unlabelled data, in a human rating study. Results confirm that a surface realiser trained from automatic slot labels can lead to outputs of comparable quality to outputs trained from human-labelled inputs.
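
Illustration: the abstract describes clustering clauses with a similarity function that combines lexical and semantic information, but does not give the function itself. The following is only a minimal sketch under stated assumptions, not the authors' algorithm: lexical similarity is taken to be Jaccard overlap of tokens, semantic similarity is Jaccard overlap of coarse classes from a hypothetical toy lexicon (`SEMANTIC_CLASSES`), the two are mixed with an assumed weight `alpha`, and clauses are grouped by a simple greedy threshold clustering. All names, weights, and thresholds here are illustrative.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon mapping words to coarse semantic classes.
# This is an assumption for illustration, not a resource from the paper.
SEMANTIC_CLASSES: Dict[str, str] = {
    "cheap": "price", "expensive": "price",
    "italian": "food", "chinese": "food",
    "centre": "area", "north": "area",
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def lexical_similarity(c1: str, c2: str) -> float:
    """Token overlap between two clauses (assumed lexical measure)."""
    return jaccard(set(c1.lower().split()), set(c2.lower().split()))


def semantic_similarity(c1: str, c2: str) -> float:
    """Overlap of the semantic classes found in each clause (assumed measure)."""
    s1 = {SEMANTIC_CLASSES[w] for w in c1.lower().split() if w in SEMANTIC_CLASSES}
    s2 = {SEMANTIC_CLASSES[w] for w in c2.lower().split() if w in SEMANTIC_CLASSES}
    return jaccard(s1, s2)


def combined_similarity(c1: str, c2: str, alpha: float = 0.5) -> float:
    """Weighted mix of lexical and semantic similarity; alpha is an assumed weight."""
    return alpha * lexical_similarity(c1, c2) + (1 - alpha) * semantic_similarity(c1, c2)


def cluster_clauses(clauses: List[str], threshold: float = 0.4,
                    alpha: float = 0.5) -> List[List[str]]:
    """Greedy clustering: put each clause in the best-matching existing
    cluster if its similarity reaches the threshold, else start a new one."""
    clusters: List[List[str]] = []
    for clause in clauses:
        best, best_score = None, threshold
        for cluster in clusters:
            # Similarity to a cluster = similarity to its closest member.
            score = max(combined_similarity(clause, m, alpha) for m in cluster)
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append([clause])
        else:
            best.append(clause)
    return clusters


clauses = [
    "it serves italian food",
    "it serves chinese food",
    "it is in the centre",
    "it is in the north",
]
clusters = cluster_clauses(clauses)
for i, cluster in enumerate(clusters):
    print(i, cluster)
```

In a sketch like this, each resulting cluster would stand in for one automatically induced slot label; how reliable such labels are depends heavily on the choice of similarity function, which is the comparison the abstract refers to.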