Article
Flexible Quantization for Efficient Convolutional Neural Networks
Publication date:
05/2024
Publisher:
MDPI
Journal:
Electronics
ISSN:
2079-9292
Language:
English
Resource type:
Published article
Subject classification:
Abstract
This work focuses on the efficient quantization of convolutional neural networks (CNNs). Specifically, we introduce a method called non-uniform uniform quantization (NUUQ), a novel quantization methodology that combines the benefits of non-uniform quantization, such as high compression levels, with the advantages of uniform quantization, which enables an efficient implementation in fixed-point hardware. NUUQ is based on decoupling the quantization levels from the number of bits. This decoupling allows for a trade-off between the spatial and temporal complexity of the implementation, which can be leveraged to further reduce the spatial complexity of the CNN without a significant performance loss. Additionally, we explore different quantization configurations and address typical use cases. The NUUQ algorithm achieves compression levels equivalent to 2 bits with no accuracy loss, and even levels equivalent to ∼1.58 bits with a performance loss of only ∼0.6%.
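The abstract describes the idea but not the implementation. As a rough illustration of decoupling quantization levels from bit-width, the following Python sketch quantizes a weight tensor to an arbitrary number of levels; the quantile-based level selection and the names nuuq_sketch and num_levels are assumptions made here for illustration, not the paper's actual method. With 3 levels, the effective storage cost is log2(3) ≈ 1.58 bits per weight, which matches the ∼1.58-bit figure quoted above.

```python
import numpy as np

def nuuq_sketch(weights: np.ndarray, num_levels: int = 3):
    """Quantize `weights` to `num_levels` values drawn from the data
    (quantiles here, a stand-in for the paper's level-selection step).
    The effective bit-width is log2(num_levels), which need not be an
    integer -- e.g. 3 levels ~ 1.58 bits per weight."""
    # Pick num_levels representative values from the weight distribution.
    qs = np.linspace(0.0, 1.0, num_levels + 2)[1:-1]  # interior quantiles
    levels = np.quantile(weights, qs)
    # Map every weight to the index of its nearest level; the integer
    # code is what a fixed-point implementation would store and process.
    idx = np.abs(weights[..., None] - levels).argmin(axis=-1)
    return levels[idx], idx, np.log2(num_levels)  # dequantized, codes, eff. bits

# Example: a small conv-layer weight tensor compressed to ~1.58 bits/weight.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
w_q, codes, bits = nuuq_sketch(w, num_levels=3)
print(f"effective bits per weight: {bits:.2f}")
```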
Associated files
License
Identifiers
Collections
Articulos (SEDE CENTRAL)
Articulos de SEDE CENTRAL
Citation
Zacchigna, Federico Giordano; Lew, Sergio Eduardo; Lutenberg, Ariel. Flexible Quantization for Efficient Convolutional Neural Networks. Electronics 13(10), 1-16, May 2024. MDPI.