Fast Estimation of Gaussian Mixture Model Parameters on GPU Using CUDA

Authors: Lukas Machlica, Jan Vanek, Zbynek Zajic

DOI: 10.1109/PDCAT.2011.40

Abstract: Gaussian Mixture Models (GMMs) are widely used among scientists, e.g. in statistics toolkits and data mining procedures. To estimate the parameters of a GMM, Maximum Likelihood (ML) training is often utilized, more precisely the Expectation-Maximization (EM) algorithm. Nowadays, a lot of tasks work with huge datasets, which makes the estimation process time consuming (mainly for complex mixture models containing hundreds of components). The paper presents an efficient and robust implementation of the EM algorithm on GPU using NVIDIA's Compute Unified Device Architecture (CUDA). An augmentation of the standard CPU version utilizing SSE instructions is also proposed. The time consumption of the presented methods is tested on a large dataset of real speech data from the NIST Speaker Recognition Evaluation (SRE) 2008. Estimation on the GPU proves to be more than 400 times faster than the standard CPU version and 130 times faster than the SSE version; thus a huge speed-up was achieved without any approximations made in the estimation formulas. The proposed implementation was also compared to other implementations developed by departments around the world and proved to be the fastest (at least 5 times faster than the best implementation published recently).
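
As an illustration of the EM estimation the abstract refers to, the following is a minimal CPU-side sketch of one EM iteration in plain Python. It is not the paper's CUDA or SSE code. The function name `em_step`, the diagonal-covariance assumption, and the `var_floor` parameter are assumptions made for this example only. The inner loop over data points, which accumulates the per-component statistics, is the kind of work a GPU implementation would spread across threads.

```python
import math

def em_step(data, weights, means, variances, var_floor=1e-6):
    """One EM iteration for a diagonal-covariance GMM (illustrative sketch).

    data: list of D-dimensional points; weights, means, variances: current
    per-component parameters. Returns (weights, means, variances, log_lik),
    where log_lik is the data log-likelihood under the input parameters.
    """
    n_comp, dim = len(weights), len(data[0])
    # Sufficient statistics: soft counts, first- and second-order sums.
    n = [0.0] * n_comp
    s1 = [[0.0] * dim for _ in range(n_comp)]
    s2 = [[0.0] * dim for _ in range(n_comp)]
    log_lik = 0.0

    for x in data:
        # E-step: log(weight * Gaussian density) for each component.
        log_p = []
        for w, mu, var in zip(weights, means, variances):
            lp = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                lp -= 0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
            log_p.append(lp)
        # Log-sum-exp keeps the normalization numerically stable.
        top = max(log_p)
        log_norm = top + math.log(sum(math.exp(lp - top) for lp in log_p))
        log_lik += log_norm
        for k, lp in enumerate(log_p):
            gamma = math.exp(lp - log_norm)  # responsibility of component k
            n[k] += gamma
            for d, xd in enumerate(x):
                s1[k][d] += gamma * xd
                s2[k][d] += gamma * xd * xd

    # M-step: closed-form ML updates from the accumulated statistics.
    # A component whose soft count underflows to zero would need extra
    # handling; this sketch does not cover that case.
    new_w = [nk / len(data) for nk in n]
    new_mu = [[s / nk for s in s1k] for s1k, nk in zip(s1, n)]
    new_var = [
        [max(q / nk - m * m, var_floor) for q, m in zip(s2k, muk)]
        for s2k, muk, nk in zip(s2, new_mu, n)
    ]
    return new_w, new_mu, new_var, log_lik
```

Calling `em_step` repeatedly and feeding each result back in gives the usual EM training loop; the returned log-likelihood should not decrease between iterations as long as the variance floor is not triggered.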

References (13)
Andrew D. Pangborn. Scalable data clustering using GPUs. (2010)
Jan Trmal, Josef V. Psutka, Josef Psutka, Jan Vanek. Optimization of the Gaussian Mixture Model Evaluation on GPU. Conference of the International Speech Communication Association, pp. 1737-1740 (2011)
Wen-mei W. Hwu, David B. Kirk. Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann (2012)
James W. Demmel, Vasily Volkov. Benchmarking GPUs to tune dense linear algebra. IEEE International Conference on High Performance Computing, Data and Analytics, pp. 31- (2008). DOI: 10.5555/1413370.1413402
Claudia Plant, Christian Bohm. Parallel EM-Clustering: Fast Convergence by Asynchronous Model Updates. International Conference on Data Mining, pp. 178-185 (2010). DOI: 10.1109/ICDMW.2010.53
Shuai Che, Michael Boyer, Jiayuan Meng, David Tarjan, Jeremy W. Sheaffer, Kevin Skadron. A performance study of general-purpose applications on graphics processors using CUDA. Journal of Parallel and Distributed Computing, vol. 68, pp. 1370-1380 (2008). DOI: 10.1016/J.JPDC.2008.05.014
N. S. L. Phani Kumar, Sanjiv Satoor, Ian Buck. Fast Parallel Expectation Maximization for Gaussian Mixture Models on GPUs Using CUDA. High Performance Computing and Communications, pp. 103-109 (2009). DOI: 10.1109/HPCC.2009.45
Xindong Wu, Vipin Kumar, J Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J McLachlan, Angus Ng, Bing Liu, Philip S Yu, Zhi-Hua Zhou, Michael Steinbach, David J Hand, Dan Steinberg. Top 10 algorithms in data mining. Knowledge and Information Systems, vol. 14, pp. 1-37 (2007). DOI: 10.1007/S10115-007-0114-2
W.M. Campbell, D.E. Sturim, D.A. Reynolds, A. Solomonoff. SVM Based Speaker Verification using a GMM Supervector Kernel and NAP Variability Compensation. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 97-100 (2006). DOI: 10.1109/ICASSP.2006.1659966