Byzantine generals problem

Omar Wells

January 31, 2022

In Web 1.0, one usually deals with “traditional” centralized systems. The system website consists of two parts, the browser (client) that displays the page and the server that delivers it to the client on request.

In the simplest configuration, the client is the one making a request to complete a task (“send me the website data!”), and a server is the one completing it (“send data!”). When there are several of these servers, it is called a decentralized system, because there are many “centrals”.

Distributed Systems 2.2: The Byzantine generals problem

The Byzantine Generals Problem - An Intro To Blockchain

The Byzantine Generals problem, and why it matters for the Internet

Contents

1. Web 2.0

2. Web 3.0

3. Unsuccessful Strategy

4. Successful strategy

5. The CAP Theorem

5.1. Available and Failure Tolerant (AP)

5.2. Consistent and Available (CA)

5.3. Consistent and Failure Tolerant (CP)

6. Consensus

Web 2.0

Web 2.0 is more about communication between people. Here, two clients want to communicate with each other. They also need a server “in the middle” to execute their requests respectively. When the Messenger app on the smartphone sends a message to another smartphone, that message is first transmitted to the Messenger server. The Messenger server notifies the recipient app. When the recipient looks at the message, their app sends a confirmation to the sending app via the server.

Both variants work well. However, the servers involved are outside the control of the clients (Facebook, WhatsApp, Google, hosted websites, etc). Behind a server could be a good company or an evil one. It could also be the mafia, an NGO or the government. The client can’t tell the difference and must rely on domain names and certificates to establish trust with a server.

Web 3.0

Web 3.0 is now trying to fix this “flaw”. Web 3.0 is understood to be peer-to-peer networking, that is, client-to-client communication without a server in between. The Bittorent protocol has impressively proven that peer-to-peer networking communication works. Client and server became an entity that is both client and server. However, this fact immediately gives rise to a new problem. How do I build trust with the other peer? Here, too, “anyone” could be located!

According to legend, Byzantine generals had a communication problem during the siege of Constantinople in 1453. Because of Constantinople’s strength, it was necessary to attack the city from several places at once. The generals were able to communicate through messengers. However, it was suspected that a few of the generals were traitors and were working with the Sultan in Constantinople. Therefore, a common agreement on the exact time of the attack was difficult because the traitors could spread false information through the messengers. However, if the attack was not simultaneous, Constantinople could not be conquered. The problem is actually difficult to solve.

There is a mathematical solution that works as long as no more than a third of the generals are traitors. I’ll leave those out of the equation for now.

Let’s assume that the attack is to take place at 5 p.m. on Wednesday. Any general can be the traitor. The messenger carries the message in an envelope, but does not know it.

Unsuccessful Strategy

General 1 sends his messenger with a message to General 2. General 2 reads and confirms the message (ok, we attack Monday at 5pm) and sends him on to General 3. General 3 is the traitor. He writes a new note with a different time without the messenger seeing it (ok, we attack Monday at 7pm). All other generals get the wrong time. The attack will fail!

Successful strategy

Under these conditions, when the messenger is sent by General 1, General 2 works exactly 10 minutes on its confirmation and adds it to the first message. There are now two papers in the messenger’s envelope that refer to each other (“chained”). If the third general now writes a different time on his confirmation, it will take 10 minutes first. Forging the other message would take another 10 minutes. However, this would be noticed by the messenger who was told that a reply would take about 10 minutes. And most importantly, the first general would notice if the messenger came back later than expected.

The CAP Theorem

The cars are in my yard and I have a list of all the vehicles. When I sell a car, I cross it off the list. My business is doing great and I open a used car dealership in the neighboring town as well. My friend Holger manages that and also has a list. We agree to put all the cars we have at the two locations on the list. When one of us sells a car, he calls the other one that the car is no longer available. We also have only one phone number that is forwarded to both phones. Everything works great!

Consistent: We are consistent because we always have our sales list on the same level.
Available: We are always available because we can be reached at one phone number. Either Holger picks up or I do. There is always someone there.
Partition Tolerance: We are failure tolerant because the system works even if one of us gets sick.

One day something happens and Holger is mad at me (I do not remember exactly what it was). He has no more desire and goes home, but tells me nothing and does not tell me that he sold two cars today. As luck would have it, someone shows up at my yard with cash and buys a car that Holger has already sold. He pays me in cash (cash in advance) and sets off to pick up the car from Holger’s yard … disaster takes its course!

Available and Failure Tolerant (AP)

Facebook and Twitter are always available (A) and failure tolerant (P). However, they are not consistent (C). The messages, comments, likes, and whatnot don’t reach all participants immediately, but are distributed throughout the system as they go.

Cloud computing and the domain name system on the Internet also fall into this category.

Consistent and Available (CA)

A relational database system should primarily be consistent and, if somehow possible, always available. Failure tolerance is not a given.

Consistent and Failure Tolerant (CP)

A bank that operates ATMs must ensure that the amount collected from the machine is reliably debited from the account. Availability is not as important in this case.

Consensus

Among the Byzantine generals, simple rules could be used to ensure that a true message could be distinguished from a fake message. This was achieved via forcing people to do something for 10 minutes. Consensus plays a very big role in blockchain technology. There are different approaches. The two best-known approaches are proof of work done (Proof of Work – as in generals) and proof of stake. Here, however, it gets technical again right away, so I’ll stop at this point for now.

Knowledge