A distributed architecture for robust automatic speech recognition

Authors: K. Hacioglu, B. Pellom

DOI: 10.1109/ICASSP.2003.1198784

Keywords: Message passing; Robustness (computer science); Speech enhancement; Interoperability; Speech recognition; Server; Computer science; Natural language; Distributed computing

Abstract: In this paper, we attempt to decompose a state-of-the-art speech recognition system into its components and define an infrastructure that allows flexible, efficient, and effective interaction among the components. Motivated by the success of the DARPA Communicator program, we select the open source Galaxy architecture as our development test bed. It consists of a hub for communication among servers connected to it by message passing, and it supports a plug-and-play paradigm. In addition, it supports high bandwidth data (binary or audio) transfer between servers via a brokering scheme. For several reasons, we believe that it is the right time to start developing a distributed framework for speech recognition, along with protocol standards supporting interoperability. We present our work towards that goal using the Colorado University (CU) Sonic recognizer. We divide Sonic into a number of servers and structure them around the Hub. We describe the system in some detail and report on its status and possibilities for future development.
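
The hub-and-server arrangement described in the abstract can be illustrated with a minimal Python sketch. This is not the Galaxy API and not the authors' implementation: the `Hub` class, the `front_end` and `decoder` stages, and the message fields are all hypothetical names chosen for illustration, and the brokering scheme for high bandwidth data is not modelled. The sketch only shows the general idea of a hub routing messages through independently registered servers.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Handler = Callable[[Message], Message]


class Hub:
    """Toy hub (hypothetical): routes a message through named servers in order."""

    def __init__(self) -> None:
        self._servers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        # Plug-and-play in miniature: a server joins, or is replaced,
        # simply by registering a handler under a name.
        self._servers[name] = handler

    def run(self, route: List[str], message: Message) -> Message:
        # Servers never call each other; every hop goes through the hub.
        for name in route:
            message = self._servers[name](message)
        return message


# Illustrative stages only; real recognizer components are far more complex.
def front_end(msg: Message) -> Message:
    samples = msg["audio"]
    return {**msg, "features": [abs(s) for s in samples]}


def decoder(msg: Message) -> Message:
    features = msg["features"]
    label = "speech" if sum(features) > 1.0 else "silence"
    return {**msg, "hypothesis": label}


hub = Hub()
hub.register("front_end", front_end)
hub.register("decoder", decoder)
result = hub.run(["front_end", "decoder"], {"audio": [0.5, -0.7, 0.2]})
print(result["hypothesis"])
```

Because each stage sees only messages, a stage can be swapped by registering a different handler under the same name, without changing the others.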
