dc.contributor.author | Radhakrishnan, H. | en |
dc.contributor.author | Rouson, D. W. I. | en |
dc.contributor.author | Morris, K. | en |
dc.contributor.author | Shende, S. | en |
dc.contributor.author | Kassinos, Stavros C. | en |
dc.creator | Radhakrishnan, H. | en |
dc.creator | Rouson, D. W. I. | en |
dc.creator | Morris, K. | en |
dc.creator | Shende, S. | en |
dc.creator | Kassinos, Stavros C. | en |
dc.date.accessioned | 2019-05-06T12:24:24Z | |
dc.date.available | 2019-05-06T12:24:24Z | |
dc.date.issued | 2015 | |
dc.identifier.uri | http://gnosis.library.ucy.ac.cy/handle/7/48745 | |
dc.description.abstract | This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015. Copyright © 2015 Hari Radhakrishnan et al. | en |
dc.language.iso | eng | en |
dc.source | Scientific Programming | en |
dc.subject | Memory architecture | en |
dc.subject | Binary trees | en |
dc.subject | Microprocessor chips | en |
dc.subject | Object oriented programming | en |
dc.subject | Distributed Memory | en |
dc.subject | FORTRAN (programming language) | en |
dc.subject | Fortran compilers | en |
dc.subject | Many-core accelerators | en |
dc.subject | Model verification | en |
dc.subject | Multi-core processor | en |
dc.subject | Multicore programming | en |
dc.subject | Parallel application | en |
dc.subject | Parallel programming | en |
dc.subject | Performance tests | en |
dc.subject | Profile analysis | en |
dc.subject | Program compilers | en |
dc.title | Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study | en |
dc.type | info:eu-repo/semantics/article | |
dc.identifier.doi | 10.1155/2015/904983 | |
dc.description.volume | 2015 | |
dc.author.faculty | Πολυτεχνική Σχολή / Faculty of Engineering | |
dc.author.department | Τμήμα Μηχανικών Μηχανολογίας και Κατασκευαστικής / Department of Mechanical and Manufacturing Engineering | |
dc.type.uhtype | Article | en |
dc.contributor.orcid | Kassinos, Stavros C. [0000-0002-3501-3851] | |
dc.gnosis.orcid | 0000-0002-3501-3851 | |