Single-Channel Multitalker Speech Recognition

Authors: Steven Rennie, John Hershey, Peder Olsen

DOI: 10.1109/MSP.2010.938081

Abstract: We have described some of the problems with modeling mixed acoustic signals in the log spectral domain using graphical models, as well as current approaches to handling these problems for multitalker speech separation and recognition. We have also reviewed methods for inference on FHMMs (factorial hidden Markov models) and for handling the nonlinear interaction function in the log spectral domain. These methods are capable of separating and recognizing speech better than human listeners on the SSC task.

References (46)
Sam T. Roweis. Factorial models and refiltering for speech separation and denoising. Conference of the International Speech Communication Association (2003).
Tuomas Virtanen. Speech recognition using factorial hidden Markov models for separation in the feature space. Conference of the International Speech Communication Association (2006).
Paris Smaragdis, Bhiksha Raj, Kevin W. Wilson. Regularized non-negative matrix factorization with temporal dependencies for speech denoising. Conference of the International Speech Communication Association, pp. 411-414 (2008).
Trausti T. Kristjansson, Brendan J. Frey, Alex Acero, Li Deng. ALGONQUIN: Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition. Conference of the International Speech Communication Association, pp. 901-904 (2001).
Steven J. Rennie, John R. Hershey, Peder A. Olsen. Signal Interaction and the Devil Function. Conference of the International Speech Communication Association, pp. 334-337 (2010).
Michael I. Jordan, Zoubin Ghahramani, Tommi S. Jaakkola, Lawrence K. Saul. An introduction to variational methods for graphical models. Machine Learning, vol. 37, pp. 105-161 (1999). DOI: 10.1023/A:1007665907178
Nir Friedman, Tal El-Hay. Incorporating expressive graphical models in variational approximations: chain-graphs and hidden variables. Uncertainty in Artificial Intelligence, pp. 136-143 (2001).
Zoubin Ghahramani, Michael Jordan. Factorial Hidden Markov Models. Neural Information Processing Systems, vol. 29, pp. 472-478 (1995). DOI: 10.1023/A:1007425814087
A.P. Varga, R.K. Moore. Hidden Markov model decomposition of speech and noise. International Conference on Acoustics, Speech, and Signal Processing, pp. 845-848 (1990). DOI: 10.1109/ICASSP.1990.115970
Frederick Jelinek. Continuous speech recognition by statistical methods. Proceedings of the IEEE, vol. 64, pp. 532-556 (1976). DOI: 10.1109/PROC.1976.10159