SOFTWARE IMPLEMENTED HARDWARE-TRANSIENT FAULTS DETECTION
DOI:
https://doi.org/10.47839/ijc.5.1.377
Keywords:
Processor hardware-transient bit-error detection, low-cost software approach
Abstract
This paper examines a software-implemented self-checking technique capable of detecting hardware-transient faults in processor registers. The proposed approach is intended to detect run-time transient bit errors in memory and in the processor status register. Error correction is not considered here. The low-cost approach is intended for commodity systems that use ordinary off-the-shelf microprocessors, where detecting operational faults is a step towards a fail-safe kind of fault-tolerant system.
References
M-C Hsueh, T.K. Tsai, R.K. Iyer, "Fault Injection Techniques and Tools," IEEE Computer, pp. 75-82, April 1997.
K.H. Huang, J.A. Abraham, "Algorithm-Based Fault Tolerance for Matrix Operations," IEEE Transactions on Computers, vol. 33, Dec. 1984, pp. 518-528.
S. Yau, F. Chen, "An Approach to Concurrent Control Flow Checking," IEEE Transactions on Software Engineering, vol. SE-6, No. 2, March 1980, pp. 126-137.
M. Zenha Rela, H. Madeira, J.G. Silva, "Experimental Evaluation of the Fail-Silent Behaviour in Programs with Consistency Checks," Proc. FTCS-26, 1996, pp. 394-403.
T.L. Criswell et al., "Single Event Upset Testing with Relativistic Heavy Ions," IEEE Trans. Nucl. Sci., vol. 31, no. 6, 1984, pp. 1559-1562.
Roger L. Tokheim, "Theory and Problems of Microprocessor," McGraw Hill Book Company, 1997.
Stephen B. Wicker, "Error Control Systems for Digital Communication and Storage," Prentice Hall, NJ, USA, pp. 72-127, 1995.
A. Benso, P.L. Civera, M. Rebaudengo, M. Sonza Reorda, "An Integrated HW and SW Fault Injection Environment for Real-Time Systems," Proc. IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, pp. 117-122, 1998.
T.P. Ma, P. Dressendorfer, "Ionizing Radiation Effects in MOS Devices and Circuits," Wiley, N.Y., 1989.
Jean-Claude Laprie, Jean Arlat, Christian Beounes, Karama Kanoun, "Hardware- and Software-Fault Tolerance: Definition and Analysis of Architectural Solutions," Proc. 17th International Symposium on Fault-Tolerant Computing, Computer Society Press, Los Alamitos, Calif., pp. 116-121, 1987.
Goutam Kumar Saha, "Transient Software Fault Tolerance Using Single-Version Algorithm," ACM Ubiquity, vol.6(28), ACM Press, USA, August, 2005.
T. Sato and I. Arita, "Tolerating Transient Faults in Microprocessors," 13th Joint Symposium on Parallel Processing, 2001, Japan.
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.