Fast semi-automatic semantic annotation for spoken dialog systems.

Authors: Ruhi Sarikaya, Paola Virga, Yuqing Gao

Abstract: This paper describes a bootstrapping methodology for semi-automatic semantic annotation of a “mini-corpus” that is conventionally annotated manually to train an initial parser used in natural language understanding (NLU) systems. We propose to cast the annotation problem as a classification problem: each word is assigned a unique set of tag(s) and/or label(s) from a universal tag/label set. The classification approach enables “local” annotation, resulting in partially annotated sentences. The proposed method reduces the annotation time and cost that form a major bottleneck in the development of NLU systems. We present experiments conducted on a medical domain “mini-corpus” that contains 10K hand-annotated sentences. Three methods are compared: parser-based (baseline), similarity-based, and classification-based annotations. The support vector machine (SVM) based classification scheme is shown to outperform both similarity-based and parser-based annotation.
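
The classification framing in the abstract, where each word receives a tag from a fixed tag set, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the tag names (`O`, `SYMPTOM`, `BODY-PART`), the context-window features, and the toy sentences are hypothetical. A simple multiclass perceptron stands in for the SVM so that the sketch needs only the Python standard library.

```python
from collections import defaultdict

# Hypothetical toy "mini-corpus": (word, tag) pairs. The tag set is invented
# for illustration and is not the paper's tag/label inventory.
TRAIN = [
    [("i", "O"), ("have", "O"), ("a", "O"), ("headache", "SYMPTOM")],
    [("my", "O"), ("arm", "BODY-PART"), ("hurts", "O")],
    [("i", "O"), ("have", "O"), ("a", "O"), ("fever", "SYMPTOM")],
    [("my", "O"), ("leg", "BODY-PART"), ("hurts", "O")],
]


def word_features(words, i):
    """Local features for the word at position i (an assumed feature set)."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    return [f"w={words[i]}", f"prev={prev_w}", f"next={next_w}"]


def score(weights, feats, tag):
    """Sum of the weights of the active features for one candidate tag."""
    return sum(weights.get((f, tag), 0.0) for f in feats)


def train_perceptron(tagged_sentences, epochs=10):
    """Train a multiclass perceptron (a stand-in for the SVM classifier)."""
    weights = defaultdict(float)  # (feature, tag) -> weight
    tags = sorted({t for sent in tagged_sentences for _, t in sent})
    for _ in range(epochs):
        for sent in tagged_sentences:
            words = [w for w, _ in sent]
            for i, (_, gold) in enumerate(sent):
                feats = word_features(words, i)
                pred = max(tags, key=lambda t: score(weights, feats, t))
                if pred != gold:
                    # Standard perceptron update: reward gold, penalize prediction.
                    for f in feats:
                        weights[(f, gold)] += 1.0
                        weights[(f, pred)] -= 1.0
    return weights, tags


def annotate(words, weights, tags):
    """Assign one tag per word, each decided independently from local context."""
    return [
        max(tags, key=lambda t: score(weights, word_features(words, i), t))
        for i in range(len(words))
    ]


weights, tags = train_perceptron(TRAIN)
print(list(zip(["i", "have", "a", "fever"],
               annotate(["i", "have", "a", "fever"], weights, tags))))
```

Because each word is tagged from its own local context, a classifier of this kind can label individual words without producing a full parse of the sentence, which is roughly the sense in which per-word classification allows local annotation.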
