Robust information extraction from automatically generated speech transcriptions

Authors: David D. Palmer, Mari Ostendorf, John D. Burger

DOI: 10.1016/S0167-6393(00)00026-1

Abstract: This paper describes a robust system for information extraction (IE) from spoken language data. The system extends previous hidden Markov model (HMM) work in IE, using a state topology designed for explicit modeling of variable-length phrases and class-based statistical smoothing to produce state-of-the-art performance for a wide range of speech error rates. Experiments on broadcast news data show that the system performs well with temporal and source differences in the data. In addition, strategies for integrating word-level confidence estimates into the model are introduced, showing improved performance by using a generic error token for incorrectly recognized words in the training data and low-confidence words in the test data.
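The generic-token strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token string `<ERR>`, the function names, the 0.5 threshold, and the assumption that recognized and reference words are already aligned one-to-one are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical placeholder symbol; the record does not say what token was used.
ERROR_TOKEN = "<ERR>"


def mask_training_errors(aligned: List[Tuple[str, str]]) -> List[str]:
    """Training side: replace each misrecognized word with the generic token.

    `aligned` holds (recognized, reference) word pairs. A one-to-one
    alignment is assumed here; insertions and deletions are not handled.
    """
    return [hyp if hyp == ref else ERROR_TOKEN for hyp, ref in aligned]


def mask_low_confidence(
    hypothesis: List[Tuple[str, float]], threshold: float = 0.5
) -> List[str]:
    """Test side: replace words whose confidence falls below `threshold`.

    `hypothesis` holds (word, confidence) pairs from the recognizer.
    The default threshold is an arbitrary illustrative value.
    """
    return [
        word if conf >= threshold else ERROR_TOKEN
        for word, conf in hypothesis
    ]


if __name__ == "__main__":
    train = [("president", "president"), ("clean", "clinton"), ("spoke", "spoke")]
    print(mask_training_errors(train))

    test = [("president", 0.93), ("clin", 0.21), ("ton", 0.34), ("visited", 0.88)]
    print(mask_low_confidence(test))
```

Under these assumptions, the tagging model sees the same placeholder symbol wherever the transcript is unreliable, both in training and at test time, rather than a misleading word.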
