Fault-tolerant computation in groups and semigroups: Applications to automata, dynamic systems and Petri nets
Authors: Hadjicostis, Christoforos N.; Verghese, G. C.
Source: Journal of the Franklin Institute
The traditional approach to fault-tolerant computation has been via modular hardware redundancy. Although universal and simple, modular redundancy is inherently expensive and inefficient. By exploiting particular structural features of a computational architecture or an algorithm, arithmetic codes and recently developed algorithm-based fault tolerance (ABFT) techniques manage to introduce "analytical redundancy" and offer more efficient fault coverage at the cost of narrower applicability and harder design. In this paper, we extend a variety of results and constructive procedures that were developed in previous work for computations that take place in an abelian group to a more general setting that considers computations in semigroups. We demonstrate possible encodings for semigroup operations of interest and use our extension to design concurrent error detection and correction schemes for group and semigroup machines. The method provides insight regarding the role of decomposition in fault-tolerant algebraic machines and results in a general, hardware-independent characterization of concurrent error detection and correction in finite semiautomata. We also demonstrate that by extending this approach to other dynamic systems, with specific hardware implementations and failure modes, we can systematically obtain fault-tolerant architectures. More specifically, we apply these techniques to linear time-invariant dynamic systems and Petri net models of discrete event systems. © 2002 The Franklin Institute. Published by Elsevier Science Ltd. All rights reserved.
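To make the idea of "analytical redundancy" in an abelian group concrete, here is a minimal sketch of a classic aN arithmetic code for integer addition. It is an illustration only and not the construction developed in the paper: the check constant `A`, the function names, and the additive `fault` parameter used to model a hardware error are all assumptions made for this example. Operands are mapped into the subgroup of multiples of `A`; because the encoding is a homomorphism, addition of codewords yields a codeword, and any result outside the subgroup signals an error.

```python
# Illustrative aN arithmetic code over the abelian group (Z, +).
# A, the function names, and the fault model are assumptions for this sketch.

A = 3  # check constant: valid codewords are exactly the multiples of A


def encode(x: int) -> int:
    """Map an operand into the redundant group; encode(x + y) == encode(x) + encode(y)."""
    return A * x


def is_valid(codeword: int) -> bool:
    """A result is a valid codeword only if it lies in the subgroup A*Z."""
    return codeword % A == 0


def decode(codeword: int) -> int:
    """Recover the plain result, raising if the check detects an error."""
    if not is_valid(codeword):
        raise ValueError("error detected: result is not a multiple of A")
    return codeword // A


def protected_add(x: int, y: int, fault: int = 0) -> int:
    """Add two operands in encoded form; `fault` models an additive error in the adder."""
    return encode(x) + encode(y) + fault


if __name__ == "__main__":
    print(decode(protected_add(5, 7)))          # fault-free addition
    print(is_valid(protected_add(5, 7, 1)))     # additive error of 1 is caught by the check
    print(is_valid(protected_add(5, 7, A)))     # an error that is a multiple of A escapes detection
```

The last call shows the limit of this particular code: errors that are themselves multiples of `A` map one valid codeword to another and go undetected, which is why the choice of encoding must be matched to the expected failure modes.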
Showing items related by title, author, creator and subject.
Hadjicostis, Christoforos N. (2004) This paper presents coding techniques that can be used to provide fault tolerance to a parallel prefix computation that is performed on a binary tree of processing nodes. More specifically, we discuss how a parallel prefix ...
Hadjicostis, Christoforos N.; Verghese, G. C. (2000) We use unreliable system replicas and unreliable voters to construct redundant dynamic systems that tolerate transient failures in their state transition and error correcting mechanisms. Using low density parity check ...
Sundaram, S.; Hadjicostis, Christoforos N. (2008) This paper develops a framework for performing fault-tolerant convolution via error-correcting codes based on the Chinese remainder theorem (CRT) with non-coprime moduli. In contrast to convolution that is protected through ...