Authors: Markus Gipp, Guillermo Marcus, Nathalie Harder, Apichat Suratanee, Karl Rohr
DOI: 10.1007/978-3-642-25707-0_11
Keywords: General-purpose computing on graphics processing units, Computer vision, CUDA, Stream processing, Speedup, Graphics, Parallel computing, Computer science, Artificial intelligence, Computation, Thread (computing), Co-occurrence matrix
Abstract: In biological applications, features are extracted from microscopy images of cells and used for automated classification. Usually, a huge number of images has to be analyzed, so that computing the features takes several weeks or months. Hence, there is a demand to speed up the computation by orders of magnitude. This paper extends previous results on the computation of co-occurrence matrices and Haralick texture features, as used for analyzing images of cells, on general-purpose graphics processing units (GPUs). New GPUs include more cores (480 stream processors), and their architecture enables new capabilities (namely, computing capabilities). With the new capabilities (by atomic functions) we further parallelize the computation of the co-occurrence matrices. The visual profiling tool was used to find the most critical bottlenecks, which we investigated and improved. Changes in the implementation, like using more threads, avoiding costly barrier synchronizations, better handling of divergent branches, and a reorganization of the thread tasks, yielded the desired performance boost. The computing time of the features for one image with around 200 cells is compared to the original software version as a reference, to our first CUDA version with computing capability v1.0, and to our improved CUDA version with computing capability v1.3. With the latest CUDA version we obtained an improvement by a factor of 1.4 over the previous CUDA version, computed on the same GPU (GeForce GTX 280). In total, we achieved a speedup of 930 with the most recent GPU (GeForce GTX 480, Fermi) compared to the original CPU version, and a speedup of 1.8 compared to the older GPU with the optimized CUDA version.
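
To make the two quantities named in the abstract concrete, here is a minimal sequential sketch of a gray-level co-occurrence matrix and two Haralick features (angular second moment and contrast). It is not the authors' CUDA implementation. The function names, the 4x4 test image, and the horizontal offset are assumptions chosen for illustration only. The comment in the inner loop marks the point where a GPU version would presumably need an atomic add, since many threads may increment the same matrix cell at once.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Symmetric count of gray-level pairs at offset (dx, dy). Hypothetical helper."""
    glcm = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                a, b = image[y][x], image[ny][nx]
                # In a parallel version, these two increments are the
                # read-modify-write operations that would have to be atomic.
                glcm[a][b] += 1
                glcm[b][a] += 1
    return glcm


def normalize(glcm):
    """Turn pair counts into joint probabilities p(i, j)."""
    total = sum(sum(row) for row in glcm)
    return [[value / total for value in row] for row in glcm]


def angular_second_moment(p):
    """Haralick feature: sum of squared probabilities (texture uniformity)."""
    return sum(value * value for row in p for value in row)


def contrast(p):
    """Haralick feature: probabilities weighted by squared gray-level difference."""
    return sum((i - j) ** 2 * value
               for i, row in enumerate(p)
               for j, value in enumerate(row))


# Small 4-level test image (illustrative data, not taken from the paper).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

glcm = cooccurrence_matrix(image, levels=4)
p = normalize(glcm)
print(glcm)
print(angular_second_moment(p), contrast(p))
```

Each pixel pair contributes independently to the matrix, which is why the accumulation step parallelizes well once concurrent increments are made safe.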