How to compare the performance of two SMT microarchitectures

Sazeides, Yiannakis; Juan, T.

doi:10.1109/ISPASS.2001.990697

Conference Object

Date

2001

Author

Sazeides, Yiannakis
Juan, T.

ISBN

0-7803-7230-1
978-0-7803-7230-6

Publisher

Institute of Electrical and Electronics Engineers Inc.

Source

2001 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2001

Pages

180-183

Keyword(s):

Distributed computer systems

Computer architecture

Initial conditions

Program processors

Reconfigurable hardware

Economic and social effects

Multitasking

Trade off

Micro architectures

Instructions per cycle

Load balance

Performance metrics

Simultaneous multi-threading

Abstract

In this paper we discuss methods and metrics for comparing the performance of two simultaneous multithreading microarchitectures. We identify conditions under which the instructions-per-cycle metric may be misleading for comparing two simultaneous multithreading microarchitectures for the same amount of work. Part of the problem is isolated to the definition of what constitutes the same work. When simulating a mix of independent programs under the same initial conditions on two different simultaneous multithreading microarchitectures, there are two approaches to ensure that the work of the two runs is the same: constant-work-per-thread or variable-work-per-thread. For both approaches the total number of instructions in the run is constant; however, for the first the number of instructions from each thread is also constant, whereas for the second it is not. We claim that: (a) when simulating two microarchitectures with the constant-work-per-thread approach, the instructions-per-cycle metric is sufficient to compare them and establish the microarchitecture with the best performance; (b) when the variable-work-per-thread approach is used, the instructions-per-cycle metric may be inadequate for comparing performance. We attribute this to the inability of the instructions-per-cycle metric to account for differences in the load-balance of the two runs. A new performance metric, SMT-speedup, is proposed that enables accurate comparison of the performance of two simultaneous multithreading microarchitectures for runs with different load-balance. The new metric considers the load-balance in terms of the size and performance of each thread. In light of the insight gained in this paper, we contend that a simultaneous multithreading microarchitecture may need to trade off throughput and load-balance to achieve the best performance. © 2001 IEEE.
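
Illustrative note (not part of the published abstract): the distinction between the two approaches can be sketched with a small Python example. All names and numbers below (ThreadRun, aggregate_ipc, the "fast" and "slow" threads, the instruction and cycle counts) are hypothetical and chosen only to show how aggregate instructions-per-cycle can differ when the per-thread split differs. The sketch does not implement SMT-speedup, whose definition is not given in this record.

```python
from dataclasses import dataclass


@dataclass
class ThreadRun:
    name: str
    instructions: int  # instructions committed by this thread in the run


def aggregate_ipc(threads, cycles):
    """Total committed instructions divided by the cycles of the run."""
    return sum(t.instructions for t in threads) / cycles


def work_split(threads):
    """Fraction of the run's total instructions contributed by each thread."""
    total = sum(t.instructions for t in threads)
    return {t.name: t.instructions / total for t in threads}


def same_work_per_thread(run_x, run_y):
    """True when every thread commits the same instruction count in both runs
    (the constant-work-per-thread condition)."""
    counts_x = {t.name: t.instructions for t in run_x}
    counts_y = {t.name: t.instructions for t in run_y}
    return counts_x == counts_y


# Hypothetical variable-work-per-thread runs: both stop after 1,000,000
# total instructions, but the split between the threads differs.
run_a = [ThreadRun("fast", 500_000), ThreadRun("slow", 500_000)]
cycles_a = 500_000  # assumed cycle count on microarchitecture A
run_b = [ThreadRun("fast", 900_000), ThreadRun("slow", 100_000)]
cycles_b = 400_000  # assumed cycle count on microarchitecture B

print(aggregate_ipc(run_a, cycles_a))      # 2.0
print(aggregate_ipc(run_b, cycles_b))      # 2.5
print(work_split(run_b))                   # "slow" contributes a tenth of the work
print(same_work_per_thread(run_a, run_b))  # False
```

In this made-up case microarchitecture B shows the higher aggregate instructions-per-cycle, but the two runs did not execute the same per-thread work: the "slow" thread made far less progress on B. This is the kind of load-balance difference that, according to the abstract, the instructions-per-cycle metric cannot account for.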