Accelerating Deep Learning Inference via Freezing

Authors: Shivaram Venkataraman, Aditya Akella, Adarsh Kumar, Arjun Balasubramanian

DOI:

Keywords:

Abstract: … caching at each intermediate layer, and we discuss techniques to reduce the cache size and improve the cache … we see that the cache requires a mere 12.5 MB of memory for ResNet-18. …
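The abstract describes caching at intermediate layers so that inputs whose intermediate activations match a cached entry can skip the remaining layers of the forward pass. Below is a minimal, illustrative sketch of that idea, not the paper's implementation: it assumes PyTorch and torchvision, picks an arbitrary split point after ResNet-18's `layer1`, and uses a hypothetical exact-match cache keyed on coarsely quantized activations. The split point, quantization scale, and cache policy are all assumptions for illustration; the paper itself discusses techniques to reduce the cache size.

```python
import torch
import torchvision.models as models

# Sketch: cache an intermediate activation of ResNet-18 so that a
# cache hit skips the rest of the network. All design choices here
# (split point, quantization scale, exact-match keying) are assumed.
model = models.resnet18(weights=None).eval()

# "prefix" computes the intermediate activation that keys the cache;
# "suffix" is the remainder of the network run only on a cache miss.
prefix = torch.nn.Sequential(
    model.conv1, model.bn1, model.relu, model.maxpool, model.layer1
)
suffix = torch.nn.Sequential(
    model.layer2, model.layer3, model.layer4, model.avgpool,
    torch.nn.Flatten(1), model.fc
)

cache = {}  # quantized-activation bytes -> cached output tensor


def cache_key(activation, scale=0.5):
    # Coarsely quantize so near-identical activations collide on
    # purpose; the scale trades cache hit rate against accuracy.
    q = torch.round(activation / scale).to(torch.int8)
    return q.numpy().tobytes()


@torch.no_grad()
def infer(x):
    h = prefix(x)
    key = cache_key(h)
    if key in cache:      # cache hit: skip the remaining layers
        return cache[key]
    out = suffix(h)       # cache miss: finish the forward pass
    cache[key] = out
    return out


x = torch.randn(1, 3, 224, 224)
print(infer(x).shape)  # first call: miss, full forward pass
print(infer(x).shape)  # identical input: hit, suffix skipped
```

A real system would bound the cache and place the lookup at each intermediate layer rather than one fixed split; this sketch only shows the hit/miss mechanics behind the reported memory figure.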
