TFluxSCC: A case study for exploiting performance in future many-core systems
Publisher: Association for Computing Machinery
Source: Proceedings of the 11th ACM Conference on Computing Frontiers, CF 2014
11th ACM International Conference on Computing Frontiers, CF 2014
The number of computational units integrated in a single processor is rapidly increasing. This suggests that applications will require efficient and effective ways to exploit the parallelism to achieve the performance offered by large-scale multicore processors. The efficient parallelization of the applications relies on the programming and execution models. On the one hand, the programming model must address the effort needed to extract parallelism for such processors. On the other hand, the execution model must handle the high levels of parallelism from the applications while efficiently exploiting the resources of the processors. In this work we use the Data-Flow model to achieve high levels of parallelism in an effort to scale the performance on the 48-core Intel Single-chip Cloud Computing (SCC) processor. We propose TFluxSCC, a software platform for execution of Data-Flow applications on the Intel SCC processor. TFluxSCC is based on the TFlux Data-Driven Multithreading (DDM) platform that was developed for commodity multicore systems. What we propose in this work is an efficient implementation of the DDM model on a clustered many-core that is used as a case study to achieve a high degree of parallelism. With TFluxSCC we achieve scalable performance in a cluster of many simple cores using a global address space without the need for cache-coherency support. Our scalability study shows that applications can scale, with speedup results ranging from 30x to 48x for 48 cores. The findings of this work provide insight into what a Data-Flow implementation requires from many-cores and what it can offer to these processors in order to scale the performance. © 2014 ACM.
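The data-driven scheduling idea behind the abstract can be illustrated with a minimal sketch. It is not the TFluxSCC or TFlux API. It assumes only the general data-flow convention that a thread becomes ready once all of its producers have finished. The function name `run_dataflow`, the `threads` and `deps` dictionaries, and the example threads are hypothetical, and the sketch runs sequentially on one core rather than across the SCC's 48 cores.

```python
from collections import deque


def run_dataflow(threads, deps):
    """Run callables in data-driven order (hypothetical illustration).

    threads: dict mapping a thread name to a callable that takes a dict
             of its producers' results.
    deps:    dict mapping a thread name to the list of its producer names.
             Every name used here must also be a key of `threads`.
    """
    # Ready count = number of producers that have not finished yet.
    ready_count = {name: len(deps.get(name, [])) for name in threads}

    # Invert the dependency map so each producer knows its consumers.
    consumers = {name: [] for name in threads}
    for name, producers in deps.items():
        for producer in producers:
            consumers[producer].append(name)

    # Threads with no producers can fire immediately.
    ready = deque(name for name, count in ready_count.items() if count == 0)
    results, order = {}, []

    while ready:
        name = ready.popleft()
        inputs = {p: results[p] for p in deps.get(name, [])}
        results[name] = threads[name](inputs)
        order.append(name)
        # Completing a thread decrements the ready count of its consumers.
        for consumer in consumers[name]:
            ready_count[consumer] -= 1
            if ready_count[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(threads):
        raise ValueError("dependency cycle: some threads never became ready")
    return results, order


# Hypothetical example: "total" may only run after "a" and "b" have produced data.
example_threads = {
    "a": lambda inputs: 2,
    "b": lambda inputs: 3,
    "total": lambda inputs: inputs["a"] + inputs["b"],
}
example_deps = {"total": ["a", "b"]}
example_results, example_order = run_dataflow(example_threads, example_deps)
```

In this sketch "a" and "b" have no producers, so a parallel runtime could place them on different cores, while "total" is released only when both have completed.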