QUENN: QUantization engine for low-power neural networks

De Prado, Miguel (Integrated Systems Laboratory, ETH Zürich Switzerland ; School of Engineering – HE-Arc Ingénierie, HES-SO // University of Applied Sciences Western Switzerland) ; Benini, Luca (Integrated Systems Laboratory, ETH Zürich Switzerland) ; Denna, Maurizio (Nviso, Switzerland) ; Pazos, Nuria (School of Engineering – HE-Arc Ingénierie, HES-SO // University of Applied Sciences Western Switzerland)

Deep Learning is moving to edge devices, ushering in a new age of distributed Artificial Intelligence (AI). The high demand for computational resources of deep neural networks may be alleviated by approximate computing techniques, most notably reduced-precision arithmetic with coarsely quantized numerical representations. In this context, Bonseyes is an initiative that enables stakeholders to bring AI to low-power and autonomous environments such as automotive, medical healthcare and consumer electronics. To achieve this, we introduce LPDNN, a framework for optimized deployment of Deep Neural Networks on heterogeneous embedded devices. In this work, we detail the quantization engine integrated in LPDNN. The engine depends on a fine-grained workflow which enables a Neural Network Design Exploration and a sensitivity analysis of each layer for quantization. We demonstrate the engine with a case study on AlexNet and VGG16 using three techniques for direct quantization: standard fixed-point, dynamic fixed-point and k-means clustering, and show the potential of the latter. We argue that a Gaussian quantizer with k-means clustering can achieve better performance than linear quantizers. Without retraining, we achieve over 55.64% savings in weight storage and 69.17% in run-time memory accesses with less than a 1% drop in top-5 accuracy on ImageNet.
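
The three direct-quantization techniques named in the abstract can be illustrated with a minimal sketch. This is not the LPDNN or QUENN implementation: the function names, bit-widths and the synthetic Gaussian weights are assumptions made for illustration, and the k-means variant is plain one-dimensional Lloyd iteration with a linearly spaced initial codebook, not necessarily the Gaussian quantizer the authors describe.

```python
import math
import random


def fixed_point(values, total_bits=8, frac_bits=4):
    """Standard fixed-point: one fixed format, round to 2**-frac_bits steps, saturate."""
    scale = 2.0 ** frac_bits
    lo = -(2 ** (total_bits - 1))      # smallest signed integer code
    hi = 2 ** (total_bits - 1) - 1     # largest signed integer code
    return [min(max(round(v * scale), lo), hi) / scale for v in values]


def dynamic_fixed_point(values, total_bits=8):
    """Dynamic fixed-point: pick the fractional length from the tensor's largest magnitude."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    # Bits needed for the integer part; negative when all values are small.
    int_bits = math.floor(math.log2(max_abs)) + 1
    frac_bits = total_bits - 1 - int_bits   # one bit is kept for the sign
    return fixed_point(values, total_bits, frac_bits)


def kmeans_quantize(values, bits=4, iters=20):
    """k-means (Lloyd) quantization: 2**bits centroids form a non-linear codebook."""
    k = 2 ** bits
    lo, hi = min(values), max(values)
    # Assumed initialisation: centroids spaced linearly over the value range.
    codebook = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        sums, counts = [0.0] * k, [0] * k
        for v in values:
            j = min(range(k), key=lambda c: abs(v - codebook[c]))
            sums[j] += v
            counts[j] += 1
        # Empty clusters keep their previous centroid.
        codebook = [sums[j] / counts[j] if counts[j] else codebook[j] for j in range(k)]
    quantized = [min(codebook, key=lambda c: abs(v - c)) for v in values]
    return quantized, codebook


def mse(a, b):
    """Mean squared error between original and quantized values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    random.seed(0)
    weights = [random.gauss(0.0, 0.05) for _ in range(1000)]  # synthetic stand-in for a layer
    print("fixed-point   MSE:", mse(weights, fixed_point(weights, 4, 2)))
    print("dynamic fixed MSE:", mse(weights, dynamic_fixed_point(weights, 4)))
    print("k-means       MSE:", mse(weights, kmeans_quantize(weights, 4)[0]))
```

The two fixed-point variants place their levels on a uniform grid, whereas the k-means codebook adapts its levels to where the values concentrate, which is the intuition behind comparing it against linear quantizers.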


Conference Type:
full paper
Faculty:
Ingénierie et Architecture
School:
HE-Arc Ingénierie
Institute:
No institute
Subject(s):
Engineering
Publisher:
Ischia, Italy, 08-10 May 2018
Date:
2018-05
Pagination:
9 p.
Published in:
Proceedings of the 15th ACM International Conference on Computing Frontiers, 8-10 May 2018, Ischia, Italy
DOI:
ISBN:
9781450357616

 Record created 2019-03-26, last modified 2019-04-23
