Layer-specific approximate multipliers for energy-precision trade-offs in convolutional neural networks.
Ladan Sayadi, Mohammad Hossein Moaiyeri, Somayeh Timarchi
Abstract
Approximate computing is a promising paradigm for error-resilient applications, such as image and signal processing and neural networks, because it trades precision for hardware efficiency. This paper presents a novel CNN-specific approximation methodology in three parts. The first part identifies the characteristics of approximate multipliers suitable for CNNs, highlighting how the variance of the weight distribution affects the error tolerance of different layers. The second part introduces three types of approximate multipliers (AM_5×5, AM_4×4, and AM_3×3), designed using innovative operand truncation techniques. These multipliers provide adjustable accuracy by varying the number of truncated operand bits and are scalable to any multiplier size. Two algorithms are also proposed: one optimizes training with approximate multipliers for improved performance, and the other employs a gradual training strategy. The third part describes two distinct strategies for deploying the methodology within CNN architectures and evaluates its hardware implementation on an ASIC in 28 nm CMOS technology. Comprehensive comparisons using the VGG10, VGG16, and AlexNet architectures reveal significant improvements in energy efficiency. The first strategy achieves energy efficiency gains of up to 86%, 95%, and 88% per operation for VGG10, VGG16, and AlexNet, respectively, while the second strategy achieves improvements of 81%, 92%, and 84% for the same networks. The approach balances computational complexity and accuracy while exploiting CNN characteristics to improve hardware efficiency. Experimental results validate the potential of this methodology to advance CNN designs, optimizing both energy and hardware resources for practical applications.
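The abstract does not give the internal design of AM_5×5, AM_4×4, or AM_3×3, so the following is only a minimal sketch of generic operand truncation, not the authors' circuits. It assumes unsigned operands and a hypothetical parameter `drop_bits` (the number of least-significant bits discarded from each operand), and it illustrates how accuracy can be adjusted by varying the number of truncated bits.

```python
def truncated_multiply(a: int, b: int, drop_bits: int) -> int:
    """Approximate a * b for unsigned integers by discarding the
    drop_bits least-significant bits of each operand (illustrative only)."""
    if a < 0 or b < 0 or drop_bits < 0:
        raise ValueError("operands and drop_bits must be non-negative")
    a_hi = a >> drop_bits          # keep only the upper bits of a
    b_hi = b >> drop_bits          # keep only the upper bits of b
    # Multiply the shortened operands, then restore the original scale.
    return (a_hi * b_hi) << (2 * drop_bits)


def mean_relative_error(drop_bits: int, width: int = 8) -> float:
    """Average relative error over all non-zero operand pairs of the given bit width."""
    total = 0.0
    count = 0
    for a in range(1, 1 << width):
        for b in range(1, 1 << width):
            exact = a * b
            approx = truncated_multiply(a, b, drop_bits)
            total += (exact - approx) / exact   # approx never exceeds exact
            count += 1
    return total / count


if __name__ == "__main__":
    # Error grows as more operand bits are truncated.
    for k in range(0, 5):
        print(k, round(mean_relative_error(k), 4))
```

In this sketch a smaller effective multiplier (fewer retained bits) corresponds to a larger `drop_bits`. A layer-specific scheme in the spirit of the paper would pick a different truncation level per layer according to that layer's error tolerance; how the authors make that choice is not stated in the abstract.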