Understanding the Nakamoto Consensus

Cardano uses a variation of the Nakamoto consensus called Ouroboros. It is a Proof-of-Stake (PoS) consensus built on the principles that Satoshi Nakamoto invented for Bitcoin. Ouroboros is designed to provide security guarantees similar to those of Proof-of-Work (PoW) while being more energy-efficient. In this article, we explain the basic principles of the Nakamoto consensus and highlight the differences between PoW and PoS. The focus is on the principles and underlying mechanisms.

The Nakamoto consensus

The Nakamoto consensus was invented by Satoshi Nakamoto, the pseudonymous creator of Bitcoin. It is a solution to the Byzantine Generals Problem, which asks whether it is possible to achieve consensus in a distributed network of independent nodes.

Bitcoin's PoW and Cardano's PoS share the common principles of the Nakamoto consensus, despite the differences in their mechanisms and implementations. The primary difference between PoW and PoS lies in how they implement these principles.

PoW requires miners to solve complex cryptographic puzzles using computational power to add new blocks to the blockchain.

PoS selects block producers (pools) to mint new blocks based on the amount of ADA coins they own or that have been delegated to them.

The fundamental difference is in the expensive resource that each consensus mechanism relies on.

PoW needs an enormous hash rate (computing power) to operate. This consumes a lot of energy. Electricity is an expensive but renewable resource, and its quantity is not capped.

PoS is more energy-efficient because digital coins are used as the resource. Electricity is consumed only to operate the nodes. The ADA coin is an expensive, non-renewable, and scarce resource.

The use of different expensive resources affected how the principles of the Nakamoto consensus were implemented. 
PoW and PoS use different mechanisms to achieve block production timing, randomness, security, decentralization, inclusiveness, egalitarianism, and so on.

PoW and PoS differ in individual elements, but they are very similar with respect to the principles of the Nakamoto consensus.

We will not cover all of these elements in this article. We mainly focus on explaining how consensus is reached. Before we get into it, let's explain the necessary theory.

What Is Consensus For?

The purpose of consensus in a distributed network is to ensure that all participating nodes, which may be spread across different locations, agree on a single state of the network despite the lack of a central authority.

This agreement allows the network to maintain a consistent and reliable state of the ledger. The network must reach an agreement at regular intervals, which allows the ledger's state to be changed.

In blockchain networks, the state changes when a new block is added. User transactions are inserted into blocks. In the case of Cardano, certificates can also be inserted into blocks.

What do we mean by consensus?

Consensus can refer to a set of rules that enables nodes within a distributed network to achieve mutual agreement (by exchanging information and making autonomous decisions). These rules are defined by the network protocol and encoded in the source code of the client. A protocol is essentially a blueprint, and the client is the software that implements this blueprint. By installing the client on their computer, users become nodes that contribute to the operation of the distributed (decentralized) network.

Consensus may also refer specifically to the implementation of this agreement process, such as the PoW or PoS mechanisms.

So, consensus can mean the network's ability to reach an agreement, but also a specific implementation. The same term can refer to an ability and a tool at the same time.

In the picture, you can see clients using the Ouroboros PoS consensus, which allows them to reach a mutual agreement on changing the state of the ledger. Note that the ledger is part of every client. State changes happen on all nodes.

Network Consensus Requirements

Before we start explaining the Nakamoto consensus, it is necessary to explain some important requirements that apply to every kind of consensus in distributed networks.

The consensus mechanism must ensure that the network operates correctly even in the presence of faulty or malicious nodes. Malicious actors who want to disrupt the ability to reach consensus, and thereby damage the network, can join an open network at any time. Several nodes in the network can also go offline at the same time for various reasons.

In a decentralized network, anyone who owns an expensive resource (hash rate in Bitcoin or coins in Cardano) can gain a share of power (i.e. participate in reaching an agreement).

The reliability and security of the network are based on the assumption that honest participants hold a larger share of the expensive resource than malicious actors, so the malicious actors cannot disrupt the consensus. Another assumption is that it will still be possible to reach a consensus if many nodes suddenly go offline. Not every type of consensus can provide this.

The network should be able to handle failures and errors, both in the network and in the participating nodes, without compromising the overall functionality of the system. The network must be fault-tolerant.

All non-faulty nodes must eventually agree on the same state of the ledger, ensuring consistency and integrity across the entire system. The network must not get into a situation where the state splits into two different states (the blockchain forks into two competing chains) that it cannot resolve.

You can see in the picture that the network maintained a uniform state of the ledger up to block N+3. After that, the state is ambiguous. There are two competing blocks at each of the heights N+4 and N+5. At that moment, it is not possible to determine which state of the ledger is valid.

Protocol rules must be prepared for every eventuality and ensure state consistency. In case of a state split, the network must be able to detect the problem and deterministically decide which of the states is the correct one (the longest chain rule).

In the picture, you can see that the network dropped the lower blocks N+4 and N+5 (they were orphaned). New blocks N+6 and N+7 were added after the upper block N+5. This chain won and represents the current (agreed) state of the ledger.
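
The fork resolution described above can be sketched in a few lines of Python. This is only an illustration, not the chain selection code of Bitcoin or Cardano: the `Block` class, the labels such as "N+4 upper", the relative heights, and the tie-breaking rule (keep the tip that was seen first) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Block:
    block_id: str             # hypothetical label, e.g. "N+4 upper"
    parent_id: Optional[str]  # None for the oldest block we track
    height: int               # height relative to N


def select_tip(blocks: Dict[str, Block]) -> Block:
    """Return the tip of the longest chain; on a tie, keep the tip seen first."""
    parent_ids = {b.parent_id for b in blocks.values()}
    tips = [b for b in blocks.values() if b.block_id not in parent_ids]
    return max(tips, key=lambda b: b.height)  # max() returns the first maximum


def chain_of(tip: Block, blocks: Dict[str, Block]) -> List[Block]:
    """Follow parent links from the tip back to the oldest tracked block."""
    chain = [tip]
    while chain[-1].parent_id is not None:
        chain.append(blocks[chain[-1].parent_id])
    return chain[::-1]


def add(blocks: Dict[str, Block], block_id: str, parent_id: Optional[str], height: int) -> None:
    blocks[block_id] = Block(block_id, parent_id, height)


blocks: Dict[str, Block] = {}
add(blocks, "N+3", None, 3)
add(blocks, "N+4 upper", "N+3", 4)
add(blocks, "N+4 lower", "N+3", 4)
add(blocks, "N+5 upper", "N+4 upper", 5)
add(blocks, "N+5 lower", "N+4 lower", 5)
# Two more producers build on the upper branch, which settles the fork.
add(blocks, "N+6", "N+5 upper", 6)
add(blocks, "N+7", "N+6", 7)

best = select_tip(blocks)
winning = chain_of(best, blocks)
orphaned = [b.block_id for b in blocks.values() if b not in winning]
print([b.block_id for b in winning])
print(orphaned)
```

Every node that has seen the same set of blocks runs the same deterministic rule and therefore arrives at the same winning chain without asking anyone else.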

Every non-faulty process must eventually reach a decision, which ensures that the consensus protocol eventually concludes. In other words, the network must not get into a situation where it does not know how to proceed with consensus. Continuity must be ensured.

If the network stopped after adding the pair of N+5 blocks and was unable to add another block that decides which state is the right one, a centralized entity (team) would have to intervene. This is undesirable.

Safety and Liveness

Safety and liveness are two critical properties that ensure the network operates correctly and securely. Teams designing a network consensus must balance these properties and decide which one they prefer. Let's describe them.

Safety refers to the guarantee that the network will not reach a false agreement. In other words, it ensures that any transaction deemed final by one properly operating node will eventually be deemed final by every properly operating node. It also means that no two transactions ever deemed final by two properly operating nodes will ever conflict.

In the extreme case, this means that the network prefers halting the consensus to settling transactions incorrectly. In a network that prioritizes safety over liveness, avoiding forks is crucial because a fork implies a disagreement on the state of the ledger, which can lead to safety violations.

Discarding transactions is highly undesirable for such a network because it might undermine the trust and reliability of the system. Users expect their transactions to be final and immutable once they are included in a block and appended to the blockchain.

You can see this undesirable situation in the picture. The network created two blocks N+4, and each block contains different transactions (small red and blue boxes). The network must discard one of the N+4 blocks, including its transactions.

Networks that prefer liveness over safety have different requirements.

Liveness is the guarantee that the network will continue to make progress and not stall. This means that as long as there are transactions that have not been finalized, the set of finalized transactions will continue to grow. It ensures that every transaction will eventually be settled by all honest nodes.

A network that prioritizes liveness over safety doesn't stop when it forks. Instead, it allows two chains to exist temporarily and relies on network participants to continue building on top of the chain they believe is the correct one. In other words, an inconsistent state of the ledger is temporarily tolerated.

In the picture, you see a situation similar to the one above, but here the rules of the network allow for it. The lower block N+4 is discarded (including the red transaction). A block N+5 is then added that contains the red transaction. Both transactions ultimately end up in the blockchain.
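
A minimal sketch of what happens to the red transaction, assuming a block is reduced to a plain list of transaction ids. The names `blue_tx`, `red_tx`, and `transactions_to_requeue` are hypothetical. A real node would also drop any returned transaction that conflicts with the winning chain, which this sketch does not model.

```python
from typing import List

Block = List[str]  # a block is reduced to a list of transaction ids


def transactions_to_requeue(winning: List[Block], orphaned: List[Block]) -> List[str]:
    """Transactions that appear only in orphaned blocks go back to the pending pool."""
    confirmed = {tx for block in winning for tx in block}
    requeued: List[str] = []
    for block in orphaned:
        for tx in block:
            if tx not in confirmed and tx not in requeued:
                requeued.append(tx)
    return requeued


upper_n4: Block = ["blue_tx"]  # block N+4 on the winning branch
lower_n4: Block = ["red_tx"]   # block N+4 that gets discarded

pending = transactions_to_requeue(winning=[upper_n4], orphaned=[lower_n4])

# The next block producer picks the pending transaction up in block N+5.
n5: Block = list(pending)
chain = [upper_n4, n5]
all_confirmed = [tx for block in chain for tx in block]
print(all_confirmed)
```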

By now, you can probably guess which property a network using the Nakamoto consensus prefers. Cardano and Bitcoin both prioritize liveness over safety.

This design choice is evident in the way the consensus handles forks of the blockchain. In the Nakamoto consensus, nodes are instructed to follow the longest chain when a fork occurs. This rule ensures that the network continues to make progress and extend the blockchain, even if there are temporary disagreements or forks.

However, this approach can lead to temporary safety violations, such as when two miners mine a block with the same height at around the same time, leading to a fork. In Cardano, two slot leaders can be elected at around the same time.

The network will eventually converge on one of these blocks (chains) under the longest chain rule, but until then, there can be conflicting views of the transaction history. The safety property is eventually restored as the consensus continues adding new blocks. One chain becomes significantly longer than the others, making it the universally accepted chain.

While the Nakamoto consensus aims to provide both safety and liveness, it is designed to ensure that the network remains live and capable of processing transactions even in the face of temporary disagreements or network issues. Safety is achieved as the blockchain grows and the probability of a deep reorganization becomes negligible.

Probabilistic Finality

The safety and liveness properties are related to the probabilistic finality of the Nakamoto consensus.

Probabilistic finality refers to the concept that the finality of a transaction is not absolute at the time it is inserted into a new block. 
The finality of a transaction becomes increasingly likely as more blocks are added on top of the block containing the transaction.

Safety and liveness relate to probabilistic finality in the following way:

Safety means that once a transaction has been included in a block and several blocks have been added on top of it (deepening its position in the chain), the likelihood of that transaction being reversed becomes very low. However, it is not zero. There is always a small chance that a longer competing chain could emerge, although the probability decreases exponentially with each additional block.

Liveness ensures that transactions will eventually be confirmed and added to the blockchain. As long as pools (block producers) continue to extend the blockchain by adding new blocks, the network exhibits liveness. Transactions are not guaranteed to be included in the next block, but they will be included eventually as long as they are valid and the network continues to operate.

Compared to consensus mechanisms that prefer safety over liveness, the Nakamoto consensus has slow settlement. Users have to wait for the network to add several blocks on top of the block in which their transaction was inserted. The transaction becomes practically irreversible only after more blocks have been added. Every newly added block (on top of the block with the user's transaction) represents an agreement with the state of the ledger. We'll get to the details later.

The picture shows that Alice's transaction (small blue box) was inserted into block N+3. The finality of the transaction is 0, as no further block has been added yet. With each additional block, the finality of the transaction (and also of block N+3) increases. Once block N+4 is added, the finality is 1, and so on. If Alice considers 3 blocks a sufficient number of confirmations (3 other block producers agree with block N+3), she can consider the transaction finalized (irreversible) in block N+6.
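
To make the exponential decrease concrete, here is a small sketch that counts confirmations and evaluates the simplified catch-up probability (q/p)^z from the Bitcoin whitepaper, where q is the attacker's share of the resource, p = 1 - q is the honest share, and z is the number of blocks the attacker is behind. This is only the gambler's-ruin part of Nakamoto's analysis. The 10% attacker share is an arbitrary illustrative value, and the formal analysis of Ouroboros is different.

```python
def confirmations(tx_block_height: int, tip_height: int) -> int:
    """Number of blocks added on top of the block that contains the transaction."""
    return max(0, tip_height - tx_block_height)


def catch_up_probability(attacker_share: float, depth: int) -> float:
    """Simplified chance that an attacker who is `depth` blocks behind ever catches up."""
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("attacker_share must be between 0 and 1")
    honest_share = 1.0 - attacker_share
    if attacker_share >= honest_share:
        return 1.0  # an attacker with half of the resource or more always catches up
    return (attacker_share / honest_share) ** depth


n = 100  # arbitrary value standing in for N
alice_depth = confirmations(tx_block_height=n + 3, tip_height=n + 6)
print("confirmations:", alice_depth)

for z in range(0, 7):
    print(z, catch_up_probability(0.10, z))
```

Each extra confirmation multiplies the probability by the same factor q/p, which is why waiting a few more blocks buys a lot of additional certainty when the attacker is small.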

Note: The finality of a transaction can be seen as a binary value: the transaction is either final (written forever in the blockchain) or not (yet). That is why we often talk about the number of confirmations instead. Each block added after the block with your transaction represents one more confirmation. In block N+6, the transaction has 3 confirmations. If Alice considers this to be a sufficient number of confirmations for a transaction with a smaller amount, she can consider the transaction finalized. A blockchain reorganization can still happen, but with a lower probability.

Networks that prefer safety over liveness can gather consent on the state more quickly. They do so not by adding blocks, but through some form of voting either before or shortly after a block is added. This kind of consensus usually requires a large number of nodes to actively participate in voting on each block.

The finality of transactions can be achieved quickly only if it is possible to get approval for the new state (block) from the majority of nodes in a short time.

Let's Produce A New Block In A Nakamoto Way

We said that consensus is about agreement across nodes on changing the state of the ledger. The Nakamoto consensus can be described simply as follows.

In a given time interval, one node in the network is randomly elected and gets the right to produce a new block. This block is broadcast to the network. If the next randomly elected node agrees with this block, it appends its new block on top of this (latest) block. If it doesn't agree, it appends its new block on top of the previous block (so not after the latest block).

Let's describe the following picture.

Alice added block N+2. This block is received by other nodes in the network, so Bob and Carol also receive it. Bob is randomly elected as the next block producer. He appends block N+3 (in green) after block N+2.

Carol receives block N+3 produced by Bob. She is randomly elected as the next block producer. Carol has two options. 
She can append block N+4 after the existing block N+3. This is the expected scenario (since there would be no inconsistency in the ledger). But suppose she doesn't like Bob's block N+3 and therefore decides to append her own block N+3 (in red) after block N+2.

This creates a fork in the blockchain (the state of the ledger is now inconsistent). The next randomly elected block producer can decide whether to append a new block after the red or the green block N+3.
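
The scenario with Alice, Bob, and Carol can be sketched with blocks that point to their parent by hash. The `Block` fields, the hashing of a simple text string, and the helper names are assumptions for illustration only. Real block headers contain much more data.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Block:
    height: int
    producer: str
    parent_hash: str

    @property
    def block_hash(self) -> str:
        data = f"{self.height}|{self.producer}|{self.parent_hash}".encode()
        return hashlib.sha256(data).hexdigest()


def append_on(parent: Block, producer: str) -> Block:
    """Produce a new block on top of the chosen parent."""
    return Block(parent.height + 1, producer, parent.block_hash)


def find_forks(blocks: List[Block]) -> Dict[str, List[Block]]:
    """Group blocks by parent; a parent with more than one child marks a fork."""
    children: Dict[str, List[Block]] = defaultdict(list)
    for block in blocks:
        children[block.parent_hash].append(block)
    return {parent: kids for parent, kids in children.items() if len(kids) > 1}


n2 = Block(height=2, producer="Alice", parent_hash="hash-of-N+1")  # placeholder parent
bob_n3 = append_on(n2, "Bob")      # green block N+3
carol_n3 = append_on(n2, "Carol")  # red block N+3, built on N+2 instead of Bob's block

forks = find_forks([n2, bob_n3, carol_n3])
for parent_hash, kids in forks.items():
    print(parent_hash[:8], [k.producer for k in kids])
```

Both N+3 blocks are valid children of N+2, so nothing in the data structure itself says which one is right. Only the blocks that later producers append decide it.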

Two functions are key to adding blocks: determining when a new block is to be added and who produces it. So, protocol timing and random node election functions are required.

Agreement across nodes occurs only after a new state is proposed. Other nodes agree to the change with a delay, by attaching a new block on top of the proposed block. Agreement cannot be obtained from all nodes at the same time, but only from a single node: the one that is chosen as the next block producer. Achieving agreement is a gradual process.

Nodes do not agree with each other on a state change before it is supposed to happen. Instead, a randomly elected node authoritatively proposes the change and assumes that others will agree. Other nodes will most likely agree if the block is valid.

The block producer is incentivized to propose a valid block, as this is the only way to be entitled to a reward.

In each round, there is one proposer and many approvers. Nodes in the network accept a valid block because they have no reason to discard it. If nodes were to discard valid blocks, for example, all blocks produced by Alice, forks would arise in the network (and thus ledger inconsistencies). This is undesirable.

The process of adding a new block consists of the following steps:

Random election of block producer.

The production of a new block by a randomly elected node.

Broadcasting the block to the network by the producer.

Validation of the block and its possible acceptance.

Same process again from step one.
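
The steps above can be sketched as a simple loop. This is a toy model under stated assumptions: the election is a weighted random draw over a hypothetical `resources` table (hash rate or stake), every node keeps its own copy of the chain, the network has no delay, and validation only checks that a block extends the current tip.

```python
import random
from typing import Dict, List

Block = Dict[str, object]


def elect_producer(resources: Dict[str, float], rng: random.Random) -> str:
    """Step 1: pick one node at random, weighted by its share of the resource."""
    names = list(resources)
    return rng.choices(names, weights=[resources[n] for n in names], k=1)[0]


def produce_block(producer: str, local_chain: List[Block]) -> Block:
    """Step 2: the elected node builds on the tip of its own copy of the chain."""
    return {"height": len(local_chain), "producer": producer}


def validate(block: Block, local_chain: List[Block]) -> bool:
    """Step 4 (toy rule): the block must extend the current tip by exactly one."""
    return block["height"] == len(local_chain)


def run_rounds(resources: Dict[str, float], rounds: int, seed: int = 0) -> Dict[str, List[Block]]:
    rng = random.Random(seed)
    genesis: Block = {"height": 0, "producer": "genesis"}
    ledgers: Dict[str, List[Block]] = {name: [genesis] for name in resources}
    for _ in range(rounds):                     # Step 5: repeat from step one
        producer = elect_producer(resources, rng)
        block = produce_block(producer, ledgers[producer])
        for chain in ledgers.values():          # Step 3: broadcast to every node
            if validate(block, chain):
                chain.append(block)
    return ledgers


resources = {"Alice": 50.0, "Bob": 30.0, "Carol": 20.0}
ledgers = run_rounds(resources, rounds=10)
print([b["producer"] for b in ledgers["Alice"]])
```

Because there is no network delay in this model, no forks occur and all nodes end up with identical ledgers. Forks appear once two producers can be elected before they have seen each other's blocks.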

The process is the same for both Cardano and Bitcoin. Although each project uses a different consensus, they are not fundamentally different. Let's describe the differences.

The differences are in protocol timing and in the random election of block producers.

Bitcoin is designed to produce a new block roughly every 10 minutes. The network adjusts the mining difficulty approximately every two weeks (2016 blocks) to maintain this block time. This adjustment ensures that the time it takes to produce a block remains consistent, even as the network's hash rate (the total computational power used for mining) changes.

The random election is based on solving a computationally demanding mathematical task (a cryptographic puzzle). Nodes that want to mine a new block (pools) start solving the task the moment they receive a new block. All nodes start at roughly the same time (subject to network delay). The node that solves the task first immediately creates a new block and broadcasts it to the network.

Note that all pools compete to solve a task at the same time, but only one can succeed. As soon as a pool receives a new valid block, it stops working on the current task and starts solving a new one (mining the next block).

Sometimes two pools solve the task at about the same time. In such a case, a fork of the blockchain occurs, which is resolved by the mechanism described above.

In Bitcoin, the process of adding a new block consists of the following steps:

All pools (with miners) start mining a new block.

One pool finds the solution to the cryptographic puzzle.

Pool creates a block and broadcasts it to the network.

Other nodes and pools validate the block.

Same process again from step one (if the block was valid).
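
The mining and validation steps above can be sketched with a toy proof of work. Bitcoin hashes block headers with double SHA-256 and compares the result against a target. Everything else here is an assumption for illustration: the header bytes, the 8-byte nonce encoding, the very easy target, and the simplified `retarget` function, which scales the target by the ratio of actual to expected time for 2016 blocks and limits the change to a factor of four.

```python
import hashlib
from typing import Optional

EXPECTED_TIMESPAN = 2016 * 600  # 2016 blocks at 10 minutes each, in seconds


def pow_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of header plus nonce, read as a big integer."""
    payload = header + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, target: int, max_nonce: int = 1_000_000) -> Optional[int]:
    """Try nonces until the hash is at or below the target (the 'puzzle')."""
    for nonce in range(max_nonce):
        if pow_hash(header, nonce) <= target:
            return nonce
    return None


def verify(header: bytes, nonce: int, target: int) -> bool:
    """Other nodes check the proof with a single hash computation."""
    return pow_hash(header, nonce) <= target


def retarget(old_target: int, actual_timespan: int, expected: int = EXPECTED_TIMESPAN) -> int:
    """Simplified difficulty adjustment: faster blocks give a lower (harder) target."""
    clamped = min(max(actual_timespan, expected // 4), expected * 4)
    return old_target * clamped // expected


header = b"previous-block-hash|transactions"  # hypothetical header contents
target = 2 ** 244                             # toy target, about 4096 tries on average

nonce = mine(header, target)
print("nonce:", nonce, "valid:", nonce is not None and verify(header, nonce, target))
```

Finding the nonce takes many hash attempts, while checking it takes one. This asymmetry is what lets every node validate a block cheaply even though producing it was expensive.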

Cardano divides time into one-second slots and uses modern cryptography, namely the Verifiable Random Function (VRF), for random selection.

Instead of all nodes solving a computationally intensive mathematical task, the so-called slot leader is decided by mathematics and random inputs.

Each node verifies in each slot whether it has become the slot leader. If so, it mints a new block. There is no central authority controlling the election. Nodes register for the election through certificates that are stored in the blockchain. Nodes can verify that they have become slot leaders completely autonomously.

Each node needs to calculate its threshold number. It is derived from the size of the stake. The stake consists of the ADA coins of the operator (pledge) and of all stakers. The larger the stake, the more blocks the pool is likely to produce in a given epoch. This is the same principle we observe in Bitcoin: the more hash rate is delegated to a pool, the more blocks the pool will mine.

Every second, every pool runs the VRF algorithm to get a VRF output. The VRF output is compared with the threshold. If the VRF output is less than the threshold, the pool has become the slot leader and gets the right to mint a new block.
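
A minimal sketch of the slot leader check. The threshold formula phi(sigma) = 1 - (1 - f)^sigma comes from the Ouroboros Praos paper, where sigma is the pool's relative stake and f is the active slot coefficient. The value f = 0.05 is an assumption that matches one-second slots and a 20-second average block time. The VRF itself is replaced here by a plain SHA-256 hash, which is NOT a real VRF: a real one is keyed with the pool's secret key and also yields a proof that other nodes can verify. The secret and nonce values are hypothetical.

```python
import hashlib

ACTIVE_SLOT_COEFF = 0.05  # f: assumed share of slots that should have a leader


def leader_threshold(relative_stake: float, f: float = ACTIVE_SLOT_COEFF) -> float:
    """phi(sigma) = 1 - (1 - f) ** sigma, the pool's chance to lead a single slot."""
    return 1.0 - (1.0 - f) ** relative_stake


def toy_vrf_output(pool_secret: bytes, epoch_nonce: bytes, slot: int) -> float:
    """Stand-in for a VRF: hash the inputs and map the digest into [0, 1)."""
    digest = hashlib.sha256(pool_secret + epoch_nonce + slot.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") / 2 ** 256


def is_slot_leader(pool_secret: bytes, epoch_nonce: bytes, slot: int, relative_stake: float) -> bool:
    """The pool leads the slot if its (toy) VRF output falls below its threshold."""
    return toy_vrf_output(pool_secret, epoch_nonce, slot) < leader_threshold(relative_stake)


pool_secret = b"hypothetical-pool-key"
epoch_nonce = b"hypothetical-epoch-nonce"

# A pool with 1% of the total stake checks 100,000 consecutive slots on its own.
led = sum(is_slot_leader(pool_secret, epoch_nonce, slot, 0.01) for slot in range(100_000))
print("threshold:", leader_threshold(0.01), "slots led:", led)
```

A useful property of this threshold is that splitting a stake across several pools does not change the combined chance of leading a slot, so there is no advantage in splitting or merging stake just to win more slots.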

In Cardano, the process of adding a new block consists of the following steps:

In each slot, all pools verify if they have become the slot leader.

The slot leader mints a new block and broadcasts it to the network.

Other nodes and pools validate the block.

Same process again from step one (if the block was valid).
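
The following simulation sketches how this private lottery plays out over many slots. It is a toy model under stated assumptions: four hypothetical pools with made-up stakes, an active slot coefficient of f = 0.05, the threshold 1 - (1 - f)^stake, and Python's seeded pseudo-random generator in place of the VRF.

```python
import random
from collections import Counter
from typing import Dict, Tuple


def leader_threshold(relative_stake: float, f: float = 0.05) -> float:
    """Chance that a pool with the given relative stake leads one slot."""
    return 1.0 - (1.0 - f) ** relative_stake


def simulate(stakes: Dict[str, float], slots: int, seed: int = 1) -> Tuple[Counter, Counter]:
    """Run the lottery for every pool in every slot and collect statistics."""
    rng = random.Random(seed)
    thresholds = {pool: leader_threshold(stake) for pool, stake in stakes.items()}
    leaders_per_slot: Counter = Counter()
    slots_won: Counter = Counter()
    for _ in range(slots):
        # Each pool draws independently, so a slot can have 0, 1, or more leaders.
        leaders = [pool for pool, t in thresholds.items() if rng.random() < t]
        leaders_per_slot[len(leaders)] += 1
        slots_won.update(leaders)
    return leaders_per_slot, slots_won


stakes = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}  # hypothetical relative stakes
slots = 200_000
leaders_per_slot, slots_won = simulate(stakes, slots)

active_slots = slots - leaders_per_slot[0]
multi_leader_slots = sum(n for k, n in leaders_per_slot.items() if k >= 2)
print("average seconds between blocks:", slots / active_slots)
print("slots with more than one leader:", multi_leader_slots)
print("slots won per pool:", dict(slots_won))
```

Most slots stay empty, roughly one slot in twenty has a leader, and a small number of slots have two leaders at once, which is exactly the situation that produces a short-lived fork.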

As you can see, it is the same process as in the case of Bitcoin.

Similar to Bitcoin, it can happen that several slot leaders are elected within 20 seconds (the average block time of Cardano). It is even possible that 2 slot leaders are elected in the same slot. Similar to Bitcoin, the longest chain rule is used to resolve the fork of the blockchain.

The output of the VRF election is a number that helps decide which chain to follow in the event of a blockchain fork.

One of the other differences is the incentive model. Bitcoin rewards pools (miners) in each new block. Cardano rewards all stake pool operators and stakers once every 5 days (epoch).

We could find many similar minor differences, for example in block validation. In the case of Bitcoin, nodes verify the proof that the mathematical task was solved. In the case of Cardano, nodes verify the proof that the block comes from a node that was elected as slot leader (other things are verified as well, such as the KES signature).

In both cases, it is ensured that the state of the ledger is not arbitrarily changed by a fraudulent node. In Bitcoin, a fraudulent node would have to expend enormous computing power to create a new fraudulent block. In Cardano, a fraudulent node would have to break the cryptography, which would also require enormous computing power.

Both approaches have similar security guarantees. Through cryptography, Cardano can more efficiently elect slot leaders at random in a given time interval and ensure that no one other than the elected slot leader can append a new block to the blockchain. Bitcoin needs to consume an enormous amount of energy to achieve the same.

Conclusion

The Nakamoto consensus has slow transaction finality but is very robust. If a large number of nodes go offline, the network is still able to produce blocks (albeit at a slower rate).

Cardano and Bitcoin differ in many aspects such as decentralization, security budget, egalitarianism, inclusiveness, etc. 
The Nakamoto consensus cannot directly affect, for example, the quality of decentralization. In the Bitcoin network, more than 50% of the blocks are produced by only two pools, while in the Cardano network, there are more than 1000 pools. This is due to a different reward mechanism (the concept of pool saturation is important). Bitcoin does not care whether there is a single block producer or 1000. The Cardano protocol is aware of its level of decentralization and can economically incentivize its growth.

Decentralization is a key feature of a distributed network. The primary goal of network consensus is to enable agreement across a relatively large number of block producers (on the order of hundreds to thousands). The quality of the individual properties of the network is determined by the mix of various elements and details. The Nakamoto consensus is one of the key elements for both Bitcoin and Cardano, but there are others.

The reward mechanism, inclusiveness, and egalitarianism influence decentralization. This is where PoW and PoS fundamentally differ from each other. Unlike mining, staking has a lower entry barrier, is less risky, and is accessible to everyone in the world. This is just one of many differences.