Why BFT protocols cannot have 1/3 dishonest parties
Principal Software Engineer - 25 September 2018 -
Principal Software Engineer - 25 September 2018 -
Any blockchain must organize the transactions in a linear order of the time of their execution, which may be different from their time of submission. Instead of trying to achieve consensus on individual transactions, it is more efficient to group them in blocks which are a finite ordered list of transactions. If a block is accepted as valid and a consensus has been achieved, that means the same thing as achieving a consensus on all of the included transactions in the mentioned order. Blocks are ordered in a sequence which is aptly named as a block-chain or simply blockchain. The chain has a beginning called the genesis block. Everyone starts off with the belief that the genesis block is valid.
The distance of any block from the genesis block in term of the number of blocks in-between is called the height of the block. The height of the genesis block can be set as zero. The height of the first block after that is one, and so on. The job of the blockchain protocol is to make sure that every honest party agrees on the block that is acceptable given any particular height.
At the heart of any blockchain, is a system that needs to agree on the system state. When multiple actors with equal privilege need to maintain a set of records, they must agree on whether some record is part of such database. This is very straightforward in a perfectly connected system with only honest parties, but it gets complicated with unreliable network and dishonest parties trying to gain from the insecurities of the system. More specifically, since honest parties cannot guarantee communication with the other honest parties and neither can they tell between honest and dishonest parties, a trivial protocol will let a dishonest party convince two different honest parties two different states of the system that cannot be true together. For example, an attacker in Bitcoin may try to make an honest party believe that he paid 1 BTC to A and make another honest party believe that he paid 1 BTC to B, where he only had 1 BTC to begin with. Such a system would allow the dishonest party to receive services from both A and B each worth 1 BTC by spending only 1 BTC altogether. In blockchain terms, such a transaction is called a double spend and a blockchain would always intend to stop such attacks. Getting multiple honest parties to agree on the correct state of the system while some dishonest parties are present is called a consensus algorithm. We will look closely into a particular kind of consensus protocol.
Bitcoin and Ethereum achieve consensus by something called a proof of work. Basically, any party can declare himself to be a leader deciding which one of the several possible correct sets of transactions is valid by solving a cryptographic puzzle ensuring some processing has been done by such party. Since there is a large reward for such a party successfully being selected as the leader, it creates a competitive environment where several parties compete to be chosen as a leader. However, since a leader is not allowed to pack conflicting/invalid transactions in the block (a block is an ordered list of valid transactions proposed by the leader), this creates only one valid chain of transactions. However, this allows some temporary disagreement between different honest parties (also known as a temporary fork), which get resolved as time goes on. It is not difficult to come by resources explaining PoW based blockchain, the reader can simply read such articles to know more unless he is already familiar with such concepts. In this blog, we dive deep into a different kind of consensus algorithm – byzantine fault tolerance or BFT.
In a BFT or a Byzantine Fault Tolerant consensus protocol, each participating party sends messages to other parties in a prescribed manner until they come to a conclusion about a certain candidate block. There is the requirement that once an honest party is able to decide a certain transaction to be valid and agreed upon, no other honest party would deny that. Each honest party confirms a consensus by making sure that a large majority of other parties (honest or dishonest) agrees to the same block for that particular height. The consensus algorithm is run fresh for each height, one after another.
The basic idea is as follows –
For every height –
Note that each party needs to know the public key of the other parties to validate their signatures. Otherwise, one single dishonest party can send the P-messages on behalf of everyone to create a consensus of his choice.
The system cannot of course work with an arbitrary number of dishonest parties, there must be a limit for the maximum number of dishonest parties the system can be protected against. In the following section, we will discuss this limit.
Maximum number of dishonest users for achieving consensus
Before we get to limits, we need to state the conditions the protocol needs to satisfy:
That being said, we try to figure out the maximum number of dishonest user any system can support to achieve consensus. Let us assume that any honest user needs to see P messages for the same block for the given height from a t fraction of the total number of parties. That is if there are a total of N parties, the honest users would look for more than t/N P messages supporting the same block before they commit the block. Let us also assume that z is the maximum fraction of dishonest parties that the protocol can support.
We first see what is required to achieve condition 1. We already assumed that the attacker has full control of the network. We also assume that the dishonest parties make a coalition among themselves to form a combined attacker. The aim of the attacker is to make two different honest users commit two different blocks. The attacker partitions the network into three groups. One containing x/N honest users (we will call this partition A), one containing the rest of the honest users (partition B) and the last partition containing all the dishonest parties (partition C). This means the fraction of parties in partition B is (1-x-z). Note that the attacker can control the network and all delivery of messages, but cannot create a P message on behalf of an honest user. Now the attacker tries to make both of those partitions commit different blocks for the same height.
Since the attacker can have z fraction of P messages for both blocks, the total fraction of P messages the attacker can achieve for a block in partition A is x+z. Similarly, the total fraction of P messages the attacker can create in partition B is z+(1-x-z) = 1-x. Since any BFT protocol needs to make sure that conflicting blocks cannot be committed, at least one of them must be less than or equal to t.
Since the attacker can choose any value for x, it must be so that for any value of x either x+z ≤ t or 1-x ≤ t.
The reader can now note that the logical condition (A or B) to be true means either A has to be true or B has to be true. This means if A is false, then B must be true. This is the same thing as (not A) ⟹ B.
So, for all x, (x+z ≤ t or 1-x ≤ t) is the same thing as saying for all x, x+z > t ⟹ 1-x ≤ t.
This can be simplified as x+z > t ⟹ x+z ≥ 1-t+z (1) .
When stated this way we can see that this condition would be met if t ≥ 1-t+z or 2t -1≥ z or z ≤ 2t-1 (2).
That would get what we want. But we have to prove that this condition is really necessary.
Suppose it is not true that z ≤ 2t-1; so, it must be true that z> 2t-1.
We will create a condition where x+z > t but x+z < 1-t+z thus violating (1). The minimum possible value for x+z is of course z and maximum is 1. The attacker can choose any value for x such that x+z is in that range. Suppose, m = max(z, t) and the attacker chooses x = (m + 1-t+z)/2 -z so that x + z = (m + 1-t+z)/2 .
We have assumed that z > 2t-1 , so 1-2t + z > 0 or 1-t+z > t > 0 (3) . Hence, since m = max(z, t), x + z = (m + 1-t+z)/2 > z. This means such an x is admissible.
Also, 1-t+z > t (according to 3) and since t<1 (as it a fraction), 1-t+z > z (4). (3) and (4) together implies 1-t+z > max(z, t) =m or m < 1-t+z (5). Hence, x + z = (m + 1-t+z)/2 < ((1-t+z) + 1-t+z)/2 = 1-t+z.
Basically, since x + z is the mean of 1-t+z and m, and since 1-t+z > m, it must be that m < x+z < 1-t+z . Observe that m=max(z,t) we have t ≤ m < x+z < 1-t+z which achieves the contradiction. Hence, we have proved that it has to be the case that z ≤ 2t-1 according to (2).
Now, let us consider condition 2. It simply says that when the network is connected, the commit must be possible. So, the fraction of honest parties must be able to cause a commit. This means (x+(1-z-x)) = 1-z > t or z < 1-t . The following diagram shows the allowable values for z.
The fraction of dishonest user is marked in green. The maximum possible value for z is of course just below the intersection of the lines z = 1-t and z = 2t-1. We say ‘just below’ and not on the intersection because z < 1-t and so z cannot be on this line. If the intersection is at t=t0, we have 1-t0 = 2t0 -1 or 3t0 = 2 or t0 = ⅔ . Hence, z < 1-t0 = 1-⅔ = ⅓ . Hence the maximum fraction of dishonest users have to be less than ⅓ and that would be achieved when t= ⅔. So if there are f dishonest parties in the system, there must be at least 2f+1 honest parties, so that the total number of parties is 3f+1 which makes the number of dishonest parties strictly less
than the ⅓ fraction of the total number of parties. This also means the threshold here is floor(⅔(3f+1)) which is 2f+1 We will count the users in terms of f from now on.
Summary: The red line in the above graph is the safety line. The z below that can maintain the safety of the chain. The blue line is the liveness line. The z below that can maintain liveness of the chain. Since both must be maintained, only the green area is allowed for the value of z. The maximum value of z is, of course, the apex which is reached when z = ⅓. In fact, z must be less than ⅓ for reasons described earlier.