MPI-FT: Portable fault tolerance scheme for MPI
Date: 2000
Source: Parallel Processing Letters
Volume: 10
Issue: 4
Pages: 371-382
Abstract
In this paper, we propose the design and development of a fault tolerance and recovery scheme for the Message Passing Interface (MPI). The proposed scheme consists of a mechanism for detecting process failures and a recovery mechanism. Two cases are considered, both assuming the existence of a monitoring process, the Observer, which triggers the recovery procedure in case of failure. In the first case, each process keeps a buffer of its own message traffic to be used in case of failure, while the implementor uses periodic tests for notification of failure by the Observer. The recovery function reproduces all communication between the surviving processes and the dead one by re-sending to the replacement process all the messages that were destined for the dead process. In the second case, the Observer receives and stores all message traffic, and sends to the replacement process all the buffered messages destined for the dead process. Solutions are provided to the dead communicator problem caused by the death of a process. A description of the prototype is provided, along with the results of experiments on efficiency and performance.
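The first case described in the abstract (each process buffers its own outgoing traffic, and on failure re-sends to the replacement everything it had sent to the dead process) can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: it does not use MPI, and the names `Process`, `Observer`, `replay_to`, `detect_failures`, and `recover` are hypothetical, chosen for illustration rather than taken from the paper's prototype.

```python
class Process:
    """Hypothetical stand-in for an MPI process with a sender-side message log."""

    def __init__(self, rank):
        self.rank = rank
        self.sent_log = {}   # destination rank -> payloads sent, in send order
        self.inbox = []      # received messages as (source rank, payload)

    def send(self, dest, payload, world):
        # Log the message locally before delivering it (assumption: the log
        # is kept in memory and is never truncated in this sketch).
        self.sent_log.setdefault(dest, []).append(payload)
        world[dest].inbox.append((self.rank, payload))

    def replay_to(self, dead_rank, replacement):
        # Re-send every message this process had addressed to the dead rank.
        for payload in self.sent_log.get(dead_rank, []):
            replacement.inbox.append((self.rank, payload))


class Observer:
    """Hypothetical monitor that detects failures and triggers recovery."""

    def __init__(self, world):
        self.world = world   # rank -> Process

    def detect_failures(self, alive_ranks):
        # Assumption: liveness is supplied by the caller; a real detector
        # would probe the processes itself.
        return [rank for rank in sorted(self.world) if rank not in alive_ranks]

    def recover(self, dead_rank):
        # Start a replacement under the same rank, then ask every survivor
        # to replay its logged traffic for that rank.
        replacement = Process(dead_rank)
        self.world[dead_rank] = replacement
        for rank in sorted(self.world):
            if rank != dead_rank:
                self.world[rank].replay_to(dead_rank, replacement)
        return replacement


def demo():
    world = {rank: Process(rank) for rank in range(3)}
    world[0].send(2, "a", world)
    world[1].send(2, "b", world)
    world[0].send(2, "c", world)

    observer = Observer(world)
    dead = observer.detect_failures(alive_ranks={0, 1})   # rank 2 has died
    replacement = observer.recover(dead[0])
    return dead, replacement.inbox


if __name__ == "__main__":
    print(demo())
```

Note that in this sketch the replay preserves message order per sender only; the interleaving between different senders is not reproduced. The second case in the abstract would move the log from the individual processes into the Observer, which would then perform the replay itself.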
Related items
Showing items related by title, author, creator and subject.
-
Conference Object
Mobile commerce: Vision and challenges (location and its management)
Samaras, George S. (Institute of Electrical and Electronics Engineers Inc., 2002) Mobile computing is distributed computing that involves elements whose location changes in the course of computation. Elements may be software components such as mobile agents or moving objects-data or hardware such as ...
-
Article
Computer-aided diagnosis in hysteroscopic imaging
Neofytou, Marios S.; Tanos, Vasilios; Constantinou, Ioannis P.; Kyriacou, Efthyvoulos C.; Pattichis, Marios S.; Pattichis, Constantinos S. (2015) The paper presents the development of a computer-aided diagnostic (CAD) system for the early detection of endometrial cancer. The proposed CAD system supports reproducibility through texture feature standardization, ...
-
Article
Grid Computing : Second European AcrossGrids Conference, AxGrids 2004, Nicosia, Cyprus, January 28-30, 2004. Revised Papers
European AcrossGrids Conference; Dikaiakos, Marios D.; SpringerLink (Online service) (2004)