The Case for Asymmetric Systolic Array Floorplanning
Date
2023-09
Publisher
IEEE
Source
IEEE International Workshop on Cellular Nanoscale Networks and their Applications (CNNA)
Abstract
The widespread proliferation of deep learning applications has triggered the need to accelerate them directly in hardware. General Matrix Multiplication (GEMM) kernels are elemental deep-learning constructs and they inherently map onto Systolic Arrays (SAs). SAs are regular structures that are well suited for accelerating matrix multiplications. Typical SAs use a pipelined array of Processing Elements (PEs), which communicate with local connections and pre-orchestrated data movements. In this work, we show that the physical layout of SAs should be asymmetric to minimize wirelength and improve energy efficiency. An asymmetric floorplan adapts better to the unequal widths of the horizontal and vertical data buses and to their switching activity profiles. We demonstrate that such physically asymmetric SAs reduce interconnect power by 9.1% when executing state-of-the-art Convolutional Neural Network (CNN) layers, as compared to SAs of the same size but with a square (i.e., symmetric) layout. The savings in interconnect power translate, in turn, to 2.1% overall power savings.
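
The intuition behind an asymmetric floorplan can be illustrated with a toy model. The sketch below is not the model used in the paper; it is a minimal illustration under stated assumptions: each PE has a fixed area, horizontal wires to the neighboring PE span the PE width, vertical wires span the PE height, and interconnect cost is proportional to bus width times switching activity times wire length. The bus widths (16 and 32 bits), the unit PE area, and all function names are hypothetical.

```python
import math

def weighted_wirelength(pe_width, pe_area, bits_h, bits_v, act_h=1.0, act_v=1.0):
    """Activity-weighted wirelength per PE in a toy model (an assumption,
    not the paper's model): horizontal wires span the PE width and
    vertical wires span the PE height."""
    pe_height = pe_area / pe_width
    return act_h * bits_h * pe_width + act_v * bits_v * pe_height

def optimal_pe_width(pe_area, bits_h, bits_v, act_h=1.0, act_v=1.0):
    """Width that minimizes c_h * w + c_v * A / w for a fixed area A.
    Setting the derivative to zero gives w = sqrt(A * c_v / c_h)."""
    cost_h = act_h * bits_h
    cost_v = act_v * bits_v
    return math.sqrt(pe_area * cost_v / cost_h)

# Hypothetical values: unit PE area, 16-bit horizontal bus, 32-bit vertical bus.
pe_area = 1.0
bits_h, bits_v = 16, 32

square_cost = weighted_wirelength(math.sqrt(pe_area), pe_area, bits_h, bits_v)
w_opt = optimal_pe_width(pe_area, bits_h, bits_v)
asym_cost = weighted_wirelength(w_opt, pe_area, bits_h, bits_v)

print(f"square PE cost:     {square_cost:.3f}")
print(f"asymmetric PE cost: {asym_cost:.3f} (width/height = {w_opt**2 / pe_area:.2f})")
print(f"relative saving:    {1 - asym_cost / square_cost:.1%}")
```

In this toy model the optimal width-to-height ratio equals the ratio of the vertical to the horizontal weighted bus cost, so a wider vertical bus favors PEs that are wider than they are tall. The numbers it prints are properties of the assumed parameters only and should not be compared with the 9.1% and 2.1% figures reported in the abstract.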