    • Article  

      ArrayFlex: A Systolic Array Architecture with Configurable Transparent Pipelining 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos; Pnevmatikatos, Dionisios (IEEE, 2023-06-02)
      Convolutional Neural Networks (CNNs) are the state-of-the-art solution for many deep learning applications. For maximum scalability, their computation should combine high performance and energy efficiency. In practice, the ...
    • Article  

      The Case for Asymmetric Systolic Array Floorplanning 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (IEEE, 2023-09)
      The widespread proliferation of deep learning applications has triggered the need to accelerate them directly in hardware. General Matrix Multiplication (GEMM) kernels are elemental deep-learning constructs and they ...
    • Article  

      Exploiting data encoding and reordering for low-power streaming in systolic arrays 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (Elsevier, 2023-10-03)
      Systolic Array (SA) architectures are well-suited for accelerating matrix multiplications through the use of a pipelined array of Processing Elements (PEs) communicating with local connections and pre-orchestrated data ...
    • Article  

      IndexMAC: A Custom RISC-V Vector Instruction to Accelerate Structured-Sparse Matrix Multiplications 

      Titopoulos, Vasileios; Alexandridis, Kosmas; Peltekis, Christodoulos; Nicopoulos, Chrysostomos; Dimitrakopoulos, Giorgos (IEEE, 2024-03)
      Structured sparsity has been proposed as an efficient way to prune the complexity of modern Machine Learning (ML) applications and to simplify the handling of sparse data in hardware. The acceleration of ML models - for ...
    • Article  

      Low-Power Data Streaming in Systolic Arrays with Bus-Invert Coding and Zero-Value Clock Gating 

      Peltekis, Christodoulos; Filippas, Dionysios; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (IEEE, 2023-07-17)
      Systolic Array (SA) architectures are well suited for accelerating matrix multiplications through the use of a pipelined array of Processing Elements (PEs) communicating with local connections and pre-orchestrated data ...
    • Article  

      Multi-Armed Bandits for Autonomous Test Application in RISC-V Processor Verification 

      Dimitrakopoulos, Giorgos; Kallitsounakis, E.; Takakis, Zacharias; Stefanidis, Apostolos; Nicopoulos, Chrysostomos (IEEE, 2023-07-17)
      Multi-armed bandit problems have recently received a great deal of attention, because they adequately formalize so-called exploration-exploitation trade-offs arising in several relevant applications of recommendation ...
    • Article  

      Reduced-Precision Floating-Point Arithmetic in Systolic Arrays with Skewed Pipelines 

      Filippas, Dionysios; Peltekis, Christodoulos; Dimitrakopoulos, Giorgos; Nicopoulos, Chrysostomos (IEEE, 2023-07-07)
      The acceleration of deep-learning kernels in hardware relies on matrix multiplications that are executed efficiently on Systolic Arrays (SA). To effectively trade off deep-learning training/inference quality with hardware ...
    • Article  

      Streaming Dilated Convolution Engine 

      Filippas, Dionysios; Nicopoulos, Chrysostomos; Dimitrakopoulos, Giorgos (IEEE, 2023-01-09)
      Convolution is one of the most critical operations in various application domains and its computation should combine high performance with energy efficiency. This requirement is critical both for standard convolution and ...