ROMA - 2015


Section: New Results

Which Verification for Soft Error Detection?

Participants: Leonardo Bautista-Gomez [Argonne National Laboratory], Anne Benoit, Aurélien Cavelan, Saurabh K. Raina [Jaypee Institute of Information Technology], Yves Robert, Hongyang Sun.

This work extends the work described in Section 7.4 to cope with imperfect verifications. Many methods are available to detect silent errors in high-performance computing (HPC) applications. Each comes with a given cost and a given recall (the fraction of all errors that it actually detects). The main contribution of this work is to characterize the optimal computational pattern for an application: which detector(s) to use, how many detectors of each type to use, and the length of the work segment that precedes each of them. We conduct a comprehensive complexity analysis of this optimization problem, showing that it is NP-complete and designing an FPTAS (Fully Polynomial-Time Approximation Scheme). On the practical side, we provide a greedy algorithm whose performance is shown to be close to the optimal for a realistic set of evaluation scenarios.
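To make the cost/recall trade-off concrete, the following is a minimal sketch and not the model or algorithm from the paper. It assumes that several detectors are applied to the same data and that each one misses an error independently of the others; the detector costs and recalls are hypothetical values chosen for illustration, and the function names are ours.

```python
from math import prod


def combined_recall(recalls):
    """Probability that at least one detector catches an error.

    Assumption (ours): detectors miss errors independently, so the
    chance that all of them miss is the product of (1 - recall).
    """
    return 1.0 - prod(1.0 - r for r in recalls)


def total_cost(costs):
    """Total time spent running the given detectors."""
    return sum(costs)


# Hypothetical detectors, given as (cost in seconds, recall).
cheap = (1.0, 0.5)
expensive = (10.0, 0.95)

# Option A: three cheap detectors in a row.
cost_a = total_cost([cheap[0]] * 3)
recall_a = combined_recall([cheap[1]] * 3)

# Option B: a single expensive detector.
cost_b = total_cost([expensive[0]])
recall_b = combined_recall([expensive[1]])

print(f"three cheap:   cost={cost_a:.1f}s recall={recall_a:.3f}")
print(f"one expensive: cost={cost_b:.1f}s recall={recall_b:.3f}")
```

Under these assumed numbers, three cheap detectors cost less than one expensive detector but reach a lower combined recall, which is the kind of tension the optimization problem has to resolve when it also chooses the segment lengths.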

This work has been published in the proceedings of HiPC'15 [21].