next up previous
Next: Oral Messages Up: Fault-Tolerant Distributed Consensus Previous: Failure Detectors

Byzantine Failures in Synchronous Systems

In the last section, we saw that there is no totally correct consensus protocol for asynchronous processes. To have any kind of fault-tolerance, then, we must make assumptions about the system or the kinds of faults we can expect. We therefore assume a synchronous system of processes, in which we can detect the lack of a message through some means (for example, time-outs).

In what is probably the most famous use of anthropomorphism in distributed computing, Lamport et al. [12] introduced the paradigm of the Byzantine Generals Problem as a model for the consensus problem in light of faulty processes which send false messages.

The worst kinds of faults are those that follow a malevolent plan. They are the most disruptive because if there exists a certain scenario in which the consensus protocol will fail, we must assume that the malevolent processes will act out that scenario. If we can develop an algorithm to handle malevolent faults, then this algorithm will also be able to handle random, fail-stop, or any other kinds of faults.

We can discuss such an algorithm by using the Byzantine Generals Problem. A city is surrounded by Byzantine generals and their forces. The generals must agree on a plan of action; that is, they must agree whether to attack the city or to retreat. Some of these generals are traitorous and want to stop the loyal generals from coming to a consensus (or worse, make them each ``agree'' on a different plan). Any general can send a message to any other general directly, and can send oral or written messages through other generals. (Oral messages can be changed by intermediate generals, whereas written ones are signed and cannot be modified.) Messengers are reliable.

It is important to realize here that we are not trying to find out who is traitorous and who is not, but we are only trying to get the loyal generals to agree on a value.

The question now is, ``How many traitorous generals are necessary before consensus among the loyal generals is impossible?'' The question yields a different answer for oral and for written messages.





next up previous
Next: Oral Messages Up: Fault-Tolerant Distributed Consensus Previous: Failure Detectors



Lawrence Kesteloot
Fri Jan 20 16:38:32 EST 1995