This article looks at Cardano's current scalability and considers how much Input Endorsers can improve it. We will not cover Hydra or other options for scaling Cardano. Input Endorsers are the next major improvement planned for the scalability of the first layer (although some partial improvements may arrive before that).
Cardano's Current Scalability
Transactions Per Second (TPS) expresses the maximum throughput of a network, and therefore roughly how many people can use the network in a given time period.
For Cardano, TPS is not a suitable metric. The accounting model based on UTxO allows sending multiple assets to multiple recipients in a single transaction. Such transactions can send assets to 100 recipients and are significantly smaller than if the network had to process 100 individual transactions.
Look at the types of transactions that Cardano currently processes. You will find that roughly 60% are smart contract (SC) transactions, 35% are simple transactions (Alice sends assets to Bob), and 5% are transactions with metadata.
For simplicity, we will use the TPS metric in the article, and in most cases, we consider only simple transactions.
Calculating TPS is relatively easy. For a rough estimate, we only need to know the block size, the average transaction size, and the block frequency.
Cardano mints a new block on average every 20 seconds. The maximum sizes of a block and of a transaction are defined by the following protocol parameters:
- Max block size: 90,112 bytes
- Max TX size: 16,384 bytes
A simple Cardano transaction can have a size of 250 to 500 bytes. Most of the time you will see a transaction of 300 bytes. SC transactions or transactions with multiple inputs or outputs are of course larger.
The more inputs and outputs a transaction has, the larger it is. Usually, one large input is taken, from which two outputs are created. One output is received by the recipient and the other output returns part of the assets back to the sender. The transaction must contain a witness (100-200 bytes).
If a block contained only transactions with a size of 300 bytes, about 300 of them would fit in the block (90,112 / 300 ≈ 300). If we divide this number by 20 seconds (the frequency of block minting), we get a TPS of 15.
If a block contained only huge SC transactions with a size of 16,000 bytes, roughly 5 to 6 of them would fit into the block (90,112 / 16,000 ≈ 5.6). Taking 6 as the upper estimate, TPS would be only 0.3.
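The rough estimate above can be written as a few lines of Python. This is only a sketch: the function name `max_tps` is made up for illustration, and the calculation ignores the block header and any other per-block overhead.

```python
def max_tps(block_size_bytes: int, tx_size_bytes: int, block_time_s: float) -> float:
    """Rough upper bound on TPS: whole transactions per block / block interval."""
    txs_per_block = block_size_bytes // tx_size_bytes  # only whole transactions fit
    return txs_per_block / block_time_s


MAX_BLOCK_SIZE = 90_112  # bytes, the protocol parameter quoted above
BLOCK_TIME = 20          # seconds, average block interval

# Simple transactions range from about 250 to 500 bytes; 300 is the typical case.
for tx_size in (250, 300, 500):
    print(tx_size, max_tps(MAX_BLOCK_SIZE, tx_size, BLOCK_TIME))
```

With 300-byte transactions this gives 15 TPS. The 250-byte and 500-byte cases give 18 and 9 TPS, which shows how sensitive the figure is to the assumed transaction size.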
However, one large SC transaction could contain roughly 250 recipients, so 1,500 recipients would be served in one block. If we calculated TPS not by transactions, but by recipients, we would get 75.
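The same arithmetic can be expressed per recipient instead of per transaction. The sketch below uses the rounded figure of 6 transactions per block from the text and, for comparison, the strict count of whole 16,000-byte transactions that fit into 90,112 bytes. The function name is hypothetical.

```python
MAX_BLOCK_SIZE = 90_112  # bytes
BLOCK_TIME = 20          # seconds


def recipients_per_second(txs_per_block: int, recipients_per_tx: int,
                          block_time_s: float) -> float:
    """Recipients served per second when every transaction pays many recipients."""
    return txs_per_block * recipients_per_tx / block_time_s


# Rounded estimate used in the text: about 6 large transactions per block.
print(recipients_per_second(6, 250, BLOCK_TIME))          # 75.0

# Strict count: only whole transactions fit, so 90,112 // 16,000 = 5.
whole_txs = MAX_BLOCK_SIZE // 16_000
print(recipients_per_second(whole_txs, 250, BLOCK_TIME))  # 62.5
```

Either way, counting recipients gives a figure several times higher than the 15 TPS obtained for simple transactions.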
Considering the current type of transactions, the current maximum throughput of Cardano is roughly 40-50 recipients per second. This is roughly 3x more than if we only considered TPS and simple transactions.
It is necessary to realize that the size of the Cardano block is relatively small compared to other blockchains.
Bitcoin blocks have a theoretical maximum size of 4 megabytes (SegWit). A more realistic maximum size of the block is 2 megabytes. The average is around 1.6 megabytes for the last year. Bitcoin mines new blocks on average every 10 minutes. Bitcoin can handle roughly 7 TPS.
Ethereum mints new blocks on average every 12 seconds. The block size of Ethereum is not fixed (it is not limited in bytes) but depends on the amount of gas used by the transactions included in each block. The gas limit (a cap on the complexity of executing the transactions) therefore determines the block size, and it can be dynamically adjusted. Currently, the gas limit is set to 15M. In recent months, block sizes have ranged from 140 to 170 kilobytes. Ethereum mints about 7,200 blocks per day and confirms about 1M transactions, so there are about 140 transactions in each block. The average Ethereum transaction is about 1,000 bytes. Currently, Ethereum is operating near its capacity limit and its TPS is 12.
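The Ethereum figures above are consistent with each other, which can be checked with a short calculation. The variable names are illustrative, and the inputs are the rounded numbers quoted in the text, not measured data.

```python
SECONDS_PER_DAY = 86_400

block_time_s = 12          # average block interval quoted above
txs_per_day = 1_000_000    # approximate daily transaction count quoted above
avg_tx_size_bytes = 1_000  # approximate average transaction size quoted above

blocks_per_day = SECONDS_PER_DAY / block_time_s
txs_per_block = txs_per_day / blocks_per_day
tps = txs_per_day / SECONDS_PER_DAY
avg_block_size_kb = txs_per_block * avg_tx_size_bytes / 1_000

print(round(blocks_per_day))     # 7200
print(round(txs_per_block))      # 139
print(round(tps))                # 12
print(round(avg_block_size_kb))  # 139
```

The derived block size of roughly 139 kilobytes sits at the lower end of the 140 to 170 kilobyte range mentioned above.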
Algorand has a block size of 5 MB and a block time of 3.3 seconds. It can do 6000 TPS and the team plans to improve that to 10,000.
Bitcoin has a long block time, which is what allows it to have a large block size, but its TPS is primarily limited by that block time. Ethereum's block time is a little over half of Cardano's (12 versus 20 seconds), and at the same time its blocks are roughly 2x larger. Still, its current TPS is 12. If we counted only simple transactions of, say, 500 bytes, the TPS could be around 30. That would be roughly 2x more than Cardano.
Algorand has a large block size and at the same time a very low block time. This is one of the reasons for the high TPS. We will talk more about this project in connection with Input Endorsers.
The size of Cardano blocks could very likely be increased to 180 kilobytes and the block minting frequency could be set to 15 seconds without a negative impact on performance. In that case, TPS would be 40, slightly higher than Ethereum.
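A small what-if calculation reproduces these comparisons. It is a sketch under stated assumptions: "180 kilobytes" is read as 180,000 bytes, Ethereum's block is taken at the upper end of its recent range (170,000 bytes), and `estimate_tps` is a hypothetical helper that ignores block overhead.

```python
def estimate_tps(block_size_bytes: int, tx_size_bytes: int, block_time_s: float) -> float:
    """Whole transactions that fit in one block, divided by the block interval."""
    return (block_size_bytes // tx_size_bytes) / block_time_s


scenarios = {
    "Cardano today (90,112 B, 20 s, 300 B tx)": estimate_tps(90_112, 300, 20),
    "Cardano what-if (180,000 B, 15 s, 300 B tx)": estimate_tps(180_000, 300, 15),
    "Ethereum, simple tx only (170,000 B, 12 s, 500 B tx)": estimate_tps(170_000, 500, 12),
}

for name, tps in scenarios.items():
    print(f"{name}: {tps:.1f} TPS")
```

Doubling the block size and shortening the interval from 20 to 15 seconds multiplies the estimate by about 2.7, from 15 to 40 TPS.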
However, an increase in the order of tens of TPS would not make a significant difference. Blockchains must be able to scale in the order of thousands of TPS within a few years. Some networks are said to already be able to do it. Cardano's mission requires getting to similar numbers. Can Cardano get there through Input Endorsers?
Input Endorsers
Input Endorsers can increase the throughput and speed of Cardano, as transactions can be streamed constantly without waiting for consensus. Instead of having one block that contains transaction data, Cardano will have three types of blocks: ranking blocks, endorsement blocks, and input blocks. Transactions will only be in input blocks. Endorsement blocks will reference input blocks, and ranking blocks will reference endorsement blocks.
This is very similar to a concept that Algorand has already implemented. It is called block pipelining. Algorand uses the concept of transaction references, and it is one of the reasons for its high TPS (of course, it is not the only reason). Before we get into Input Endorsers, let's briefly explain how block pipelining works in Algorand.
The Algorand network randomly selects a committee of users for each block, who then propose and vote on the block in a single round.
There are two types of blocks in Algorand: key blocks and microblocks. Key blocks are used to achieve consensus on the network. Besides other things (information about the proposer, the committee, etc.), key blocks reference multiple microblocks. Microblocks are used to store transaction data.
The committee only votes on key blocks, not microblocks. Microblocks are verified by participating nodes before they are included in the key block.
Algorand only includes references to the state changes in the key blocks, rather than the entire state of the ledger. A reference is just a 32-byte hash of the state changes that occurred in a block. A hash is much smaller than the state it refers to. This reduces the size of the key blocks and allows for faster propagation and verification of the blocks across the network.
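To illustrate why references keep consensus blocks small, the sketch below hashes a payload of arbitrary size into a fixed 32-byte reference. SHA-256 is used here only because it is readily available and produces 32 bytes; it is an assumption for illustration, not a statement about which hash function Algorand uses.

```python
import hashlib


def reference(payload: bytes) -> bytes:
    """Return a fixed-size 32-byte reference to an arbitrarily large payload."""
    return hashlib.sha256(payload).digest()


# Stand-in for a block full of transaction data (90,112 bytes of zeros).
payload = b"\x00" * 90_112
ref = reference(payload)

print(len(payload))  # 90112
print(len(ref))      # 32
```

The reference is always 32 bytes regardless of the payload size, and any node that holds the payload can recompute the hash to check that it matches.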
Input Endorsers and block pipelining have many similarities:
- Both features split the block into two parts: one for consensus and one for transactions.
- Both features enable constant streaming of transaction blocks, regardless of the consensus process.
- Both features aim to increase the throughput and speed of the network by reducing the block propagation times and allowing higher transaction rates.
It can be said that the IOG team is implementing a solution similar to one that already works in practice. On the other hand, there are also differences between the two solutions. The biggest difference is probably that Algorand’s block pipelining relies on a single layer of endorsement blocks, while Cardano’s Input Endorsers rely on a hierarchical structure of input, endorsement, and ranking blocks.
Let's describe the blocks that will be used in Cardano after the Input Endorsers feature is delivered. We describe it from the top (the network consensus) to the bottom (data).
- Ranking blocks are used to achieve consensus on the Cardano network. They are similar to the current blocks, except that they do not contain any transaction data. Instead, they contain a reference to a set of endorsement blocks that are compatible with each other. In addition, every ranking block contains a signature from the block producer who created it. Ranking blocks are produced by slot leaders, who are randomly selected by the protocol according to their stake. Ranking blocks are responsible for maintaining the security and finality of the blockchain. They can be produced every 15-30 seconds.
- Endorsement blocks contain a reference to a single input block, along with a signature from the input endorser who created it. They are produced and streamed by input endorsers and they are subject to a validity check by the block producers. Endorsement blocks have a parent block, which is the last ranking block on the chain, and they can have multiple (compatible) child blocks. They can be produced every 5-10 seconds.
- Input blocks contain the transaction data. They are produced and streamed by input endorsers, who are randomly selected stakeholders that can choose transactions from the mem-pool and propagate them through the network. Input blocks do not have any parent or child blocks, and they do not participate in the consensus process. They are simply a way of broadcasting transactions to the network. They can be produced every 0.2-2 seconds.
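The three block types described above can be sketched as simple data structures. All class and field names below are hypothetical and chosen for illustration; the real block formats contain more fields and are defined by the protocol, not by this article.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class InputBlock:
    block_id: str
    transactions: Tuple[str, ...]  # the only place where transaction data lives


@dataclass(frozen=True)
class EndorsementBlock:
    block_id: str
    input_block_id: str      # reference to a single input block, as described above
    endorser_signature: str


@dataclass(frozen=True)
class RankingBlock:
    block_id: str
    endorsement_block_ids: Tuple[str, ...]  # references only, no transaction data
    producer_signature: str


def resolve_transactions(ranking: RankingBlock,
                         endorsements: Dict[str, EndorsementBlock],
                         inputs: Dict[str, InputBlock]) -> List[str]:
    """Follow references from a ranking block down to the transactions."""
    txs: List[str] = []
    for eb_id in ranking.endorsement_block_ids:
        input_block = inputs[endorsements[eb_id].input_block_id]
        txs.extend(input_block.transactions)
    return txs


ib1 = InputBlock("IB-1", ("tx-a", "tx-b"))
ib2 = InputBlock("IB-2", ("tx-c",))
eb1 = EndorsementBlock("EB-1", "IB-1", "sig-endorser-1")
eb2 = EndorsementBlock("EB-2", "IB-2", "sig-endorser-2")
rb = RankingBlock("RB-1", ("EB-1", "EB-2"), "sig-producer")

endorsements = {eb.block_id: eb for eb in (eb1, eb2)}
inputs = {ib.block_id: ib for ib in (ib1, ib2)}
print(resolve_transactions(rb, endorsements, inputs))  # ['tx-a', 'tx-b', 'tx-c']
```

The point of the sketch is that the ranking block stays small: it carries only identifiers, and the transaction data is reached by following two levels of references.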
Endorsement blocks contain a reference to only one input block. An endorsement block can be referenced by other endorsement blocks that are compatible with it.
For example, if an endorsement block EB-1 references an input block IB-1, and another endorsement block EB-2 references an input block IB-2, and both IB-1 and IB-2 have no conflicting transactions, then EB-2 can reference EB-1 as its parent block. This way, EB-2 becomes a child block of EB-1, and both EB-1 and EB-2 are compatible with each other. This allows for the creation of a tree-like structure of endorsement blocks, where each branch represents a different set of transactions that can be included in the ledger via the next ranking block.
The purpose of having multiple child blocks for an endorsement block is to increase the chances of finding a compatible set of endorsement blocks for each ranking block. A ranking block can reference a set of endorsement blocks that are compatible with each other, meaning that they have no conflicting transactions in their referenced input blocks. By having multiple child blocks for an endorsement block, the block producer can choose the best branch that maximizes the number of transactions that can be included in the ledger.
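The idea of a compatible set can be sketched as follows. The model is an assumption made for illustration: each input block is reduced to the set of UTxO identifiers that its transactions spend, and two blocks conflict when they spend the same UTxO. The selection is a simple greedy pass in arrival order, not the rule the protocol will actually use to pick the best branch.

```python
from typing import Dict, FrozenSet, List, Set


def compatible(a: FrozenSet[str], b: FrozenSet[str]) -> bool:
    """Two input blocks are compatible if they spend no UTxO in common."""
    return a.isdisjoint(b)


def select_compatible(candidates: Dict[str, FrozenSet[str]]) -> List[str]:
    """Greedily keep each block that does not conflict with those already kept."""
    chosen: List[str] = []
    spent: Set[str] = set()
    for block_id, spends in candidates.items():
        if spent.isdisjoint(spends):
            chosen.append(block_id)
            spent |= spends
    return chosen


candidates = {
    "IB-1": frozenset({"utxo-1", "utxo-2"}),
    "IB-2": frozenset({"utxo-3"}),
    "IB-3": frozenset({"utxo-2", "utxo-4"}),  # conflicts with IB-1 on utxo-2
}

print(compatible(candidates["IB-1"], candidates["IB-2"]))  # True
print(select_compatible(candidates))                       # ['IB-1', 'IB-2']
```

A greedy pass does not always find the largest compatible set, which is one reason why having several candidate branches to choose from can help the block producer include more transactions.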
We will explain Input Endorsers in more detail in another article (including pictures). In this article, we wanted to clarify the basic concepts.
The most important thing to note is that blocks with transactions can be minted (streamed) essentially constantly (every 0.2-2 seconds). This means that instead of creating one data block every 20 seconds, 10 to 100 input blocks can be minted during the same time. The high frequency of minting input blocks does not hinder network consensus.
Now let's focus on the potential increase in scalability.
One Cardano block can hold 300 simple transactions. If the block time for input blocks is set to 2 seconds, 10 blocks will be created within 20 seconds, which corresponds to 3000 transactions. If it is set to 0.2 seconds, there will be 100 blocks with 30,000 transactions.
So TPS will rise to 150 to 1500.
If the network layer made it possible to increase the size of input blocks, it would have a positive impact on TPS. When increasing the input block to 180 kilobytes, Cardano could have a TPS of around 300 to 3000.
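These estimates follow directly from the number of input blocks minted per 20-second window. The sketch below reproduces them; the helper name is hypothetical, and the calculation assumes that every input block is full and that every transaction ends up in the ledger.

```python
def input_endorser_tps(txs_per_input_block: int, input_blocks_per_window: int,
                       window_s: int = 20) -> float:
    """Transactions carried by all input blocks in one window, per second."""
    return txs_per_input_block * input_blocks_per_window / window_s


# 10 input blocks per 20 s (one every 2 s) or 100 per 20 s (one every 0.2 s).
for size_label, txs_per_block in (("90 kB", 300), ("180 kB", 600)):
    slow = input_endorser_tps(txs_per_block, 10)
    fast = input_endorser_tps(txs_per_block, 100)
    print(f"{size_label} input blocks: {slow:.0f} to {fast:.0f} TPS")
```

For 300 transactions per input block this prints 150 to 1500 TPS, and for 600 transactions per input block it prints 300 to 3000 TPS.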
Increasing scalability depends on many other network qualities and technologies, such as diffusion pipelining, Mithril, etc. The Input Endorsers feature can be used more effectively if other individual network properties can be improved.
It is important to mention that the TPS figures we present are indicative only. In practice, there will be conflicts between transactions, so it is possible that only part of the endorsement blocks will get into the ranking blocks in given rounds.
The potential increase in scalability that Input Endorsers can bring is not only about input blocks (and the number of transactions in them) but also about ranking blocks.
The ranking blocks are subject to a validity check by the block producers, who must ensure that they follow the protocol rules and consensus parameters. The validity check will be more demanding on resources, as it will be necessary to verify more transactions. The ranking blocks also have a limited size and minting frequency. These limits determine how many references can be included in a ranking block and how often ranking blocks can be produced (every 15-30 seconds). Scalability is therefore limited not only by input blocks but also by ranking blocks.
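The effect of this second limit can be sketched as taking the minimum of two rates. The number of references that fit into a ranking block is not stated in this article, so the value of 50 used below is purely hypothetical, as are the function names. The sketch also assumes one input block per endorsement block, as described above.

```python
def input_side_tps(txs_per_input_block: int, input_blocks_per_window: int,
                   window_s: int = 20) -> float:
    """Rate at which input blocks can supply transactions."""
    return txs_per_input_block * input_blocks_per_window / window_s


def ranking_side_tps(refs_per_ranking_block: int, txs_per_input_block: int,
                     ranking_interval_s: int) -> float:
    """Rate at which ranking blocks can admit transactions via references."""
    return refs_per_ranking_block * txs_per_input_block / ranking_interval_s


def effective_tps(input_tps: float, ranking_tps: float) -> float:
    """Throughput is capped by whichever side is slower."""
    return min(input_tps, ranking_tps)


supply = input_side_tps(300, 100)     # 1500.0, input blocks every 0.2 s
cap = ranking_side_tps(50, 300, 20)   # 750.0, with a hypothetical 50 references
print(effective_tps(supply, cap))     # 750.0
```

With these made-up numbers the ranking blocks, not the input blocks, would be the bottleneck, which is why both parameters have to be tuned together.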
Blockchain networks cannot generate an unlimited number of transactions, as this generates a huge amount of data. Data availability is one of the crucial aspects of blockchain scalability. In a decentralized network, the validity check takes place on nodes that are all over the world. Data must be available for them to make the check.
The larger the block size, the more transactions can be processed in the block, but also more bandwidth and storage are required from the nodes. The shorter the block time, the faster the transactions can be confirmed, but also the higher the risk of forks and orphaned blocks.
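The storage side of this trade-off can be estimated from the TPS figures used earlier in the article. The sketch assumes an average transaction size of 300 bytes, counts only raw transaction data, and uses decimal gigabytes. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def ledger_growth_gb_per_day(tps: float, avg_tx_size_bytes: int) -> float:
    """Raw transaction data added to the ledger per day, in gigabytes (10^9 bytes)."""
    bytes_per_day = tps * avg_tx_size_bytes * SECONDS_PER_DAY
    return bytes_per_day / 1e9


for tps in (15, 150, 1500):
    growth = ledger_growth_gb_per_day(tps, 300)
    print(f"{tps} TPS: about {growth:.2f} GB per day")
```

At a sustained 15 TPS the ledger grows by roughly 0.39 GB per day, while at 1,500 TPS it grows by roughly 38.88 GB per day, and every full node has to download, verify, and store that data.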
Teams must carefully balance decentralization and scalability. Increasing scalability might have a negative impact on decentralization, as it can reduce the number and diversity of nodes that can participate in the network consensus, and increase the power of certain nodes over others.
Conclusion
Input Endorsers may not be the last improvement to the Cardano protocol. It is often said that sharding is essential for high scalability. However, sharding also has its disadvantages, resulting mainly from the need for communication and synchronization between shards. Cardano could have sharding after Input Endorsers are implemented. One shard could have a TPS roughly like the Cardano network with Input Endorsers. So a Cardano with 10 shards could have a TPS of around 20,000. However, this is very far in the future, and a lot of technological details would have to be worked out first: for example, sharding at the storage level, improving Mithril and supporting light clients, and pruning (throwing away old transactions).