Fault injection in GPGPU cores to validate and debug robust parallel applications

Authors: M. De Carvalho, D. Sabena, M. Sonza Reorda, L. Sterpone, P. Rech

DOI: 10.1109/IOLTS.2014.6873699

Abstract: General Purpose Graphic Processing Units (GPGPUs) are more efficient than CPUs for processing parallel data. Unfortunately, GPGPUs are sensitive to radiation. Hence, several software mitigation techniques, as well as robust algorithms, are being developed to overcome reliability problems. In this paper we propose a debugger-based fault injection mechanism to evaluate the resiliency of applications running on a GPGPU and to validate the hardening techniques they may embed. We report some experimental results gathered on selected case studies to show the advantages and limitations of the proposed approach.
