    • Article  

      ArrayFlex: A Systolic Array Architecture with Configurable Transparent Pipelining 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos; Pnevmatikatos, Dionisios (IEEE, 2023-06-02)
      Convolutional Neural Networks (CNNs) are the state-of-the-art solution for many deep learning applications. For maximum scalability, their computation should combine high performance and energy efficiency. In practice, the ...
    • Article  

      The Case for Asymmetric Systolic Array Floorplanning 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (IEEE, 2023-09)
      The widespread proliferation of deep learning applications has triggered the need to accelerate them directly in hardware. General Matrix Multiplication (GEMM) kernels are elemental deep-learning constructs and they ...
    • Article  

      Exploiting data encoding and reordering for low-power streaming in systolic arrays 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (Elsevier, 2023-10-03)
      Systolic Array (SA) architectures are well-suited for accelerating matrix multiplications through the use of a pipelined array of Processing Elements (PEs) communicating with local connections and pre-orchestrated data ...
    • Article  

      Low-Power Data Streaming in Systolic Arrays with Bus-Invert Coding and Zero-Value Clock Gating 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (IEEE, 2023-07-17)
      Systolic Array (SA) architectures are well suited for accelerating matrix multiplications through the use of a pipelined array of Processing Elements (PEs) communicating with local connections and pre-orchestrated data ...
    • Article  

      Reduced-Precision Floating-Point Arithmetic in Systolic Arrays with Skewed Pipelines 

      Filippas, Dionysios; Peltekis, Christodoulos; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (IEEE, 2023-07-07)
      The acceleration of deep-learning kernels in hardware relies on matrix multiplications that are executed efficiently on Systolic Arrays (SA). To effectively trade off deep-learning training/inference quality with hardware ...
    • Article  

      Streaming Dilated Convolution Engine 

      Filippas, Dionysios; Nicopoulos, Chrysostomos; Dimitrakopoulos, Giorgos (IEEE, 2023-01-09)
      Convolution is one of the most critical operations in various application domains and its computation should combine high performance with energy efficiency. This requirement is critical both for standard convolution and ...