Fault-tolerant computation in groups and semigroups: Applications to automata, dynamic systems and Petri nets
Date
2002Source
Journal of the Franklin InstituteVolume
339Issue
4-5Pages
387-430Google Scholar check
Keyword(s):
Metadata
Show full item recordAbstract
The traditional approach to fault-tolerant computation has been via modular hardware redundancy. Although universal and simple, modular redundancy is inherently expensive and inefficient. By exploiting particular structural features of a computational architecture or an algorithm, arithmetic codes and recently developed algorithm-based fault tolerance (ABFT) techniques manage to introduce "analytical redundancy" and offer more efficient fault coverage at the cost of narrower applicability and harder design. In this paper, we extend a variety of results and constructive procedures that were developed in previous work for computations that take place in an abelian group to a more general setting that considers computations in semigroups. We demonstrate possible encodings for semigroup operations of interest and use our extension to design concurrent error detection and correction schemes for group and semigroup machines. The method provides insight regarding the role of decomposition in fault-tolerant algebraic machines and results in a general, hardware-independent characterization of concurrent error detection and correction in finite semiautomata. We also demonstrate that by extending this approach to other dynamic systems, with specific hardware implementations and failure modes, we can systematically obtain fault-tolerant architectures. More specifically, we apply these techniques to linear time-invariant dynamic systems and Petri net models of discrete event systems. © 2002 The Franklin Institute. Published by Elsevier Science Ltd. All rights reserved.
Collections
Cite as
Related items
Showing items related by title, author, creator and subject.
-
Article
MPI-FT: Portable fault tolerance scheme for MPI
Louca, Soulla P.; Neophytou, Neophytos; Lachanas, Adrianos; Evripidou, Paraskevas (2000)In this paper, we propose the design and development of a fault tolerant and recovery scheme for the Message Passing Interface (MPI). The proposed scheme consists of a detection mechanism for detecting process failures, ...
-
Article
Fault-tolerant dynamic systems
Hadjicostis, Christoforos N.; Verghese, G. C. (2000)We use unreliable system replicas and unreliable voters to construct redundant dynamic systems that tolerate transient failures in their state transition and error correcting mechanisms. Using low density parity check ...
-
Article
Nonconcurrent error detection and correction in fault-tolerant discrete-time LTI dynamic systems
Hadjicostis, Christoforos N. (2003)This paper develops resource-efficient alternatives to modular redundancy for fault-tolerant discrete-time (DT) linear time-invariant (LTI) dynamic systems. The proposed method extends previous approaches that are based ...