Authors: Sajid Anwar, Kyuyeon Hwang, Wonyong Sung
DOI: 10.1145/3005348
Keywords: Machine learning, Computational complexity theory, Computational resource, Importance weight, Artificial intelligence, Deep learning, Particle filter, Computer science, Convolutional neural network, Kernel (image processing)
Abstract: Real-time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique to solve this problem. However, pruning usually results in irregular network connections that not only demand extra representation effort but also do not fit well on parallel computation. We introduce structured sparsity at various scales for convolutional neural networks: feature map-wise, kernel-wise, and intra-kernel strided sparsity. This structured sparsity is very advantageous for direct computational resource savings on embedded computers, in parallel computing environments, and in hardware-based systems. To decide the importance of network connections and paths, the proposed method uses a particle filtering approach, in which the importance weight of each particle is assigned by assessing the misclassification rate with a corresponding connectivity pattern. The pruned network is then retrained to compensate for the losses due to pruning. While implementing convolutions as matrix products, we particularly show that intra-kernel strided sparsity with a simple constraint can significantly reduce the size of the kernel and feature map tensors. The proposed work shows that when the pruning granularities are applied in combination, we can prune the CIFAR-10 network by more than 70% with less than a 1% loss in accuracy.
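To make the three pruning granularities named in the abstract concrete, here is a minimal NumPy sketch. The tensor shape, the function names, and the particular channels and kernels dropped are illustrative assumptions, not the authors' code; the point is only how each granularity zeroes a different structural unit of a convolutional weight tensor.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy conv weight tensor: (out_channels, in_channels, kH, kW).
W = rng.standard_normal((8, 4, 3, 3))

def prune_feature_maps(W, drop):
    """Feature map-wise sparsity: zero entire output channels."""
    M = W.copy()
    M[list(drop)] = 0.0
    return M

def prune_kernels(W, drop):
    """Kernel-wise sparsity: zero whole (out, in) 2D kernels."""
    M = W.copy()
    for o, i in drop:
        M[o, i] = 0.0
    return M

def prune_intra_kernel_strided(W, stride=2, offset=0):
    """Intra-kernel strided sparsity: keep only the weights on a
    fixed stride grid inside every kernel, zero the rest."""
    mask = np.zeros(W.shape[-2:], dtype=bool)
    mask[offset::stride, offset::stride] = True
    return W * mask  # mask broadcasts over (out, in)

# Apply the granularities in combination, as the paper proposes.
Wp = prune_intra_kernel_strided(
    prune_kernels(prune_feature_maps(W, drop={7}), drop={(0, 1), (2, 3)}))
print(f"overall sparsity: {np.mean(Wp == 0):.2%}")
```

Because every kernel shares the same intra-kernel stride pattern, the surviving weights stay on a regular grid, which is what lets the lowered matrix-product (im2col) representation shrink its kernel and feature-map tensors rather than store irregular indices.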
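The particle filtering step can likewise be sketched: each particle is a candidate connectivity pattern, its importance weight is derived from the misclassification rate under that pattern, and particles are resampled and perturbed. The synthetic error function, the weight temperature, and the swap-based perturbation below are assumptions made to keep the example self-contained; in the paper the error would come from evaluating the pruned network on held-out data.

```python
import numpy as np

rng = np.random.default_rng(1)

def misclassification_rate(mask):
    """Stand-in for evaluating a pruned net on a validation set.
    This synthetic score simply favors keeping early channels."""
    importance = np.linspace(1.0, 0.1, mask.size)
    return 1.0 - importance @ mask / importance.sum()

n_channels, n_particles, keep = 16, 32, 8
# Each particle is a binary connectivity pattern keeping `keep` channels.
particles = np.stack([
    np.isin(np.arange(n_channels),
            rng.choice(n_channels, keep, replace=False)).astype(float)
    for _ in range(n_particles)])

for _ in range(20):
    # Importance weights: lower misclassification -> higher weight.
    errs = np.array([misclassification_rate(p) for p in particles])
    w = np.exp(-10.0 * errs)
    w /= w.sum()
    # Resample particles in proportion to their weights...
    particles = particles[rng.choice(n_particles, n_particles, p=w)]
    # ...then perturb: occasionally swap one kept channel for a pruned one.
    for p in particles:
        if rng.random() < 0.5:
            on, off = np.flatnonzero(p == 1), np.flatnonzero(p == 0)
            p[rng.choice(on)], p[rng.choice(off)] = 0.0, 1.0

best = particles[np.argmin([misclassification_rate(p) for p in particles])]
print("kept channels:", np.flatnonzero(best))
```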