Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors

Authors: Jan Vanek, Jan Trmal, Josef V. Psutka, Josef Psutka

DOI: 10.1109/TASL.2012.2190928

Abstract: In this paper, we describe an optimized version of a Gaussian-mixture-based acoustic model likelihood evaluation algorithm for graphical processing units (GPUs). The evaluation of these likelihoods is one of the most computationally intensive parts of automatic speech recognizers, but it can be parallelized and offloaded to GPU devices. Our approach offers a significant speed-up over recently published approaches, because it utilizes the GPU architecture in a more effective manner. All recent implementations have been intended only for NVIDIA graphics processors, programmed in either the CUDA or the OpenCL programming framework. We present results for both CUDA and OpenCL. Further, we have developed an implementation for ATI/AMD GPUs. Results suggest that even very large acoustic models can be used in real-time speech recognition engines on computers and laptops equipped with a low-end GPU. In addition, the completely asynchronous GPU management provides additional CPU resources for the decoder part of the LVCSR. This enables us to apply fusion techniques together with evaluating many (10 or more) speaker-specific acoustic models. We apply this technique to a real-time parliamentary speech recognition system where the speaker changes frequently.

References (14)
Kurt Keutzer, Ekaterina Gonina, Jike Chong, Youngmin Yi, A fully data parallel WFST-based large vocabulary continuous speech recognition on a graphics processing unit. Conference of the International Speech Communication Association, pp. 1183-1186 (2009)
Pierre Dumouchel, Gilles Boulianne, Patrick Cardinal, Michel Comeau, GPU accelerated acoustic likelihood computations. Conference of the International Speech Communication Association, pp. 964-967 (2008)
Miroslav Novak, Pavel Kveton, Accelerating hierarchical acoustic likelihood computation on graphics processors. Conference of the International Speech Communication Association, pp. 350-353 (2010)
Aleš Pražák, J. V. Psutka, Jan Hoidekr, Jakub Kanis, Luděk Müller, Josef Psutka, Automatic Online Subtitling of the Czech Parliament Meetings. Text, Speech and Dialogue, pp. 501-508 (2006), 10.1007/11846406_63
Wen-mei W. Hwu, David B. Kirk, Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann (2012)
Paul R. Dixon, Tasuku Oonishi, Sadaoki Furui, Harnessing graphics processors for the fast computation of acoustic likelihoods in speech recognition. Computer Speech & Language, vol. 23, pp. 510-526 (2009), 10.1016/J.CSL.2009.03.005
Paul R. Dixon, Tasuku Oonishi, Sadaoki Furui, Fast acoustic computations using graphics processors. International Conference on Acoustics, Speech, and Signal Processing, pp. 4321-4324 (2009), 10.1109/ICASSP.2009.4960585
James W. Demmel, Vasily Volkov, Benchmarking GPUs to tune dense linear algebra. IEEE International Conference on High Performance Computing Data and Analytics, pp. 31- (2008), 10.5555/1413370.1413402
Kshitij Gupta, John D. Owens, Three-layer optimizations for fast GMM computations on GPU-like parallel processors. IEEE Automatic Speech Recognition and Understanding Workshop, pp. 146-151 (2009), 10.1109/ASRU.2009.5373410