Optimistically synchronized parallel discrete-event simulation is based on the use of communicating sequential processes. Optimistic synchronization means that the processes execute without blocking, under the assumption that events will happen to be processed in the correct order. Periodic checkpointing of the state of a process allows the process to roll back to an earlier state when synchronization errors occur. This paper examines the effects of varying the frequency of checkpointing on the time and space needed to execute a simulation.
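The checkpoint-and-rollback mechanism described above can be illustrated with a minimal sketch. It is not the implementation studied in the paper: the `Process` class, its toy `count`/`total` state, and the `demo` helper are hypothetical names introduced only for illustration, and events are assumed to be `(timestamp, value)` pairs executed in nondecreasing timestamp order until a straggler forces a rollback.

```python
import copy


class Process:
    """Hypothetical simulation process that checkpoints every `interval` events."""

    def __init__(self, interval):
        self.interval = interval
        self.state = {"count": 0, "total": 0}   # toy state, assumed for illustration
        self.processed = []                     # events executed so far
        # Each checkpoint is (number of events executed, saved copy of the state).
        self.checkpoints = [(0, copy.deepcopy(self.state))]

    def _apply(self, event):
        _, value = event
        self.state["count"] += 1
        self.state["total"] += value

    def execute(self, event):
        """Optimistically execute an event and checkpoint periodically."""
        self._apply(event)
        self.processed.append(event)
        if len(self.processed) % self.interval == 0:
            self.checkpoints.append((len(self.processed), copy.deepcopy(self.state)))

    def rollback(self, straggler_time):
        """Undo every event with timestamp >= straggler_time.

        Returns (number of events re-executed, list of undone events).
        """
        keep = sum(1 for t, _ in self.processed if t < straggler_time)
        # Discard checkpoints taken after the rollback point.
        while self.checkpoints[-1][0] > keep:
            self.checkpoints.pop()
        done, saved = self.checkpoints[-1]
        self.state = copy.deepcopy(saved)
        # Re-execute the events between the restored checkpoint and the rollback point.
        replay = self.processed[done:keep]
        for event in replay:
            self._apply(event)
        undone = self.processed[keep:]
        self.processed = self.processed[:keep]
        return len(replay), undone


def demo(interval):
    """Run ten events, then roll back to time 6; report space and time costs."""
    p = Process(interval)
    for t in range(1, 11):
        p.execute((t, t))
    saved = len(p.checkpoints)      # space cost: states held in memory
    replayed, _ = p.rollback(6)     # time cost: events re-executed
    return saved, replayed
```

In this sketch, `demo(1)` holds eleven saved states and re-executes no events on rollback, while `demo(3)` holds four saved states and re-executes two events. This is only a toy view of the space and time costs that the checkpoint interval trades against each other.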
The results presented in this paper were obtained from the simulation of closed stochastic queueing networks with several different topologies. Various process scheduling algorithms and message cancellation strategies are considered. The empirical results are compared with analytical formulae predicting time-optimal checkpoint intervals. It is shown that the time-optimal and space-optimal checkpoint intervals are not the same. Furthermore, a checkpoint interval that is too small adversely affects space more than time, whereas a checkpoint interval that is too large adversely affects time more than space.
Copyright 1992 by Simulation Councils, Inc.