Authors: Vincent Gripon, Yoshua Bengio, Matthieu Arzel, Nicolas Farrugia, Ghouthi Boukli Hacene
DOI:
Keywords:
Abstract: Convolutional Neural Networks (CNNs) are state-of-the-art in numerous computer vision tasks such as object classification and detection. However, the large number of parameters they contain leads to a high computational complexity and strongly limits their usability in budget-constrained devices such as embedded devices. In this paper, we propose a combination of a new pruning technique and a quantization scheme that effectively reduces the memory usage of the convolutional layers of CNNs and replaces the complex convolutional operation by a low-cost multiplexer. We perform experiments on CIFAR10, CIFAR100 and SVHN and show that the proposed method achieves almost state-of-the-art accuracy, while drastically reducing the computational and memory footprints. We also propose an efficient hardware architecture to accelerate CNN operations. The proposed hardware architecture is a pipeline and accommodates multiple layers working at the same time to speed up the inference process.