Recurrent neural networks for voice activity detection

Authors: Thad Hughes, Keir Mierle

DOI: 10.1109/ICASSP.2013.6639096


Abstract: We present a novel recurrent neural network (RNN) model for voice activity detection. Our multi-layer RNN model, in which nodes compute quadratic polynomials, outperforms a much larger baseline system composed of Gaussian mixture models (GMMs) and a hand-tuned state machine (SM) for temporal smoothing. All parameters of our RNN model are optimized together, so that it properly weights its preference for temporal continuity against the acoustic features in each frame. Our RNN uses one tenth the parameters and outperforms the GMM+SM baseline system by a 26% reduction in false alarms, reducing overall speech recognition computation time by 17% while reducing word error rate by 1% relative.
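To make the idea of a recurrent node that computes a quadratic polynomial more concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the single-node layout, the tanh squashing, the one-dimensional feature, and every weight value (`LINEAR_W`, `QUAD_W`, `BIAS`) are assumptions chosen by hand for illustration, whereas the paper's model is multi-layer and all of its parameters are trained jointly. The sketch only shows how feeding the previous output back in lets a node trade off temporal continuity against the current frame's acoustic evidence.

```python
import math

# Hypothetical hand-picked parameters (NOT from the paper; a trained model
# would learn these). Input vector is [frame_feature, previous_state].
LINEAR_W = [2.0, 1.5]              # linear terms: feature, previous state
QUAD_W = [[0.0, 0.5],              # upper-triangular quadratic terms:
          [0.0, 0.0]]              # only the feature * previous_state product
BIAS = -0.5


def quadratic_node(inputs, linear_w, quad_w, bias):
    """Evaluate bias + sum_i a_i*x_i + sum_{i<=j} q_ij*x_i*x_j, then tanh."""
    total = bias
    n = len(inputs)
    for i in range(n):
        total += linear_w[i] * inputs[i]
        for j in range(i, n):
            total += quad_w[i][j] * inputs[i] * inputs[j]
    return math.tanh(total)


def run_vad(frames, linear_w, quad_w, bias, threshold=0.0):
    """Return one speech/non-speech decision per frame.

    The node's previous output is appended to each frame's features, so the
    decision depends on both the current acoustics and the recent history.
    """
    state = 0.0
    decisions = []
    for features in frames:
        state = quadratic_node(list(features) + [state], linear_w, quad_w, bias)
        decisions.append(state > threshold)
    return decisions


if __name__ == "__main__":
    # Toy input: one normalized energy-like feature per frame (assumed).
    frames = [[0.0], [1.0], [1.0], [0.0]] + [[0.0]] * 10
    print(run_vad(frames, LINEAR_W, QUAD_W, BIAS))
```

With these illustrative weights, the decision stays on for a few frames after the feature drops back to zero and then switches off. That is the kind of smoothing the baseline delegates to a separate hand-tuned state machine; in this sketch it falls out of the recurrent connection.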


