Browsing by Subject "Program processors"
-
Conference Object
Data parallel acceleration of decision support queries using Cell/BE and GPUs
(2009) Decision Support System (DSS) workloads are known to be one of the most time-consuming database workloads that processes large data sets. Traditionally, DSS queries have been accelerated using large-scale multiprocessor. ...
-
Article
Data-Driven Thread Execution on Heterogeneous Processors
(2017) In this paper we report our experience in implementing and evaluating the Data-Driven Multithreading (DDM) model on a heterogeneous multi-core processor. DDM is a non-blocking multithreading model that decouples the ...
-
Conference Object
DDM-CMP: Data-driven multithreading on a chip multiprocessor
(2005) High-end microprocessors achieve their performance as a result of adding more features and therefore increasing their complexity. In this paper we present DDM-CMP, a Chip-Multiprocessor using the Data-Driven Multithreading ...
-
Conference Object
Disconnected quark loop contributions to nucleon observables using Nf = 2 twisted clover fermions at the physical value of the light quark mass
(Proceedings of Science (PoS), 2015) We compute the disconnected quark loop contributions entering the determination of nucleon observables, by using a Nf = 2 ensemble of twisted mass fermions with a clover term at a pion mass mπ = 133 MeV. We employ exact ...
-
Conference Object
Energy efficient stream-based configurable architecture for embedded platforms
(2012) Reconfigurable hardware can be used as an energy and performance efficient co-processing solution to accelerate certain types of applications. To facilitate the design of hardware accelerators we have proposed a methodology ...
-
Article
Evaluation of disconnected quark loops for hadron structure using GPUs
(2014) A number of stochastic methods developed for the calculation of fermion loops are investigated and compared, in particular with respect to their efficiency when implemented on Graphics Processing Units (GPUs). We assess ...
-
Article
Evaluation of fermion loops applied to the calculation of the η′ mass and the nucleon scalar and electromagnetic form factors
(2012) The exact evaluation of the disconnected diagram contributions to the flavor-singlet pseudo-scalar meson mass, the nucleon σ-term and the nucleon electromagnetic form factors is carried out utilizing GPGPU technology with ...
-
Conference Object
Exploring graphics processor performance for general purpose applications
(2005) Graphics processors are designed to perform many floating-point operations per second. Consequently, they are an attractive architecture for high-performance computing at a low cost. Nevertheless, it is still not very clear ...
-
Article
A Family of Resource-Bound Real-Time Process Algebras
(2006) The Algebra of Communicating Shared Resources (ACSR) is a timed process algebra which extends classical process algebras with the notion of a resource. It takes the view that the timing behavior of a real-time system depends ...
-
Article
Fine-grain parallelism using multi-core, Cell/BE, and GPU systems
(2012) Currently, we are facing a situation where applications exhibit increasing computational demands and where a large variety of parallel processor systems are available. In this paper we focus on exploiting fine-grain ...
-
Conference Object
Fine-grain parallelism using multi-core, Cell/BE, and GPU systems: Accelerating the phylogenetic likelihood function
(2009) We are currently faced with the situation where applications have increasing computational demands and there is a wide selection of parallel processor systems. In this paper we focus on exploiting fine-grain parallelism ...
-
Article
A flexible personalization architecture for wireless Internet based on mobile agents
(2002) The explosive growth of the Internet has fuelled the creation of new and exciting information services. Most of the current technology has been designed for desktop and larger computers with medium to high bandwidth and ...
-
Conference Object
How to compare the performance of two SMT microarchitectures
(Institute of Electrical and Electronics Engineers Inc., 2001) In this paper we discuss methods and metrics for comparing the performance of two simultaneous multithreading microarchitectures. We identify conditions under which the instructions-per-cycle metric may be misleading for ...
-
Conference Object
Implicit-storing and redundant-encoding-of-attribute information in error-correction-codes
(2013) This paper proposes implicit-storing to extend the logical capacity of a memory array without increasing its physical capacity by leveraging the array's error-correction-codes to infer the implicitly stored bits. ...
-
Article
Initial experiences porting a bioinformatics application to a graphics processor
(2005) Bioinformatics applications are one of the most relevant and compute-demanding applications today. While normally these applications are executed on clusters or dedicated parallel systems, in this work we explore the use ...
-
Conference Object
Modeling program predictability
(IEEE Comp Soc, 1998) Basic properties of program predictability - for both values and control - are defined and studied. We take the view that program predictability originates at certain points during a program's execution, flows through ...
-
Conference Object
Modeling the implications of DRAM failures and protection techniques on datacenter TCO
(IEEE Computer Society, 2015) Total Cost of Ownership (TCO) is a key optimization metric for the design of a datacenter. This paper proposes, for the first time, a framework for modeling the implications of DRAM failures and DRAM error protection ...
-
Conference Object
Modeling value speculation
(IEEE Computer Society, 2002) Several studies of speculative execution based on values have reported promising performance potential. However, virtually all microarchitectures in these studies were described in an ambiguous manner, mainly due to the ...
-
Conference Object
A parallel implementation of a multi-objective evolutionary algorithm
(2009) Multi-objective Evolutionary Algorithms (MOEAs) have features that can be exploited to harness the processing power offered by modern multi-core CPUs. Modern programming languages offer the ability to use threads and ...
-
Conference Object
The performance vulnerability of architectural and non-architectural arrays to permanent faults
(2012) This paper presents a first-order analytical model for determining the performance degradation caused by permanently faulty cells in architectural and non-architectural arrays. We refer to this degradation as the performance ...