Cardano and sharding

Published 28.6.2023

Cardano is a monolithic blockchain, and upgrading to Ouroboros Leios won't change that. Many blockchain project teams have implemented sharding in an effort to achieve higher scalability. Cardano could also, in theory, one day have sharding. However, this will definitely not happen before the implementation of Ouroboros Leios. Let's explore the differences between a monolithic and a sharded blockchain from a consensus perspective. The topic is complex, so we deliberately focus on only some aspects and, for the sake of simplicity, do not aim to be complete or comprehensive.

Challenges for monolithic and sharded blockchains

A monolithic blockchain handles all the core components of the system, such as consensus, data availability, and execution, in the same layer or space. All nodes participating in the network consensus share a single space in which they jointly validate transactions and blocks.

A sharded blockchain partitions the system into smaller subgroups called shards. Each shard can handle a portion of the transactions independently and in parallel. The nodes are divided between the shards. If there were 1000 nodes and 10 shards in the network, 100 nodes would be assigned to each shard according to specific rules.
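
The 1000-node example can be sketched in a few lines. This is only an illustration: the assignment rule (ranking nodes by a hash of a hypothetical `epoch_seed` and the node identifier, then dealing them out round-robin) and the name `distribute_nodes` are assumptions made for this sketch, not the rule used by any particular sharded blockchain.

```python
import hashlib


def distribute_nodes(node_ids, num_shards, epoch_seed):
    """Deal nodes into equally sized shards in a pseudorandom but reproducible order."""

    def rank(node_id):
        # Hypothetical rule: order nodes by a hash of the seed and the node id.
        return hashlib.sha256(f"{epoch_seed}:{node_id}".encode("utf-8")).hexdigest()

    ordered = sorted(node_ids, key=rank)
    shards = [[] for _ in range(num_shards)]
    for position, node_id in enumerate(ordered):
        shards[position % num_shards].append(node_id)
    return shards


nodes = [f"node-{i}" for i in range(1000)]
shards = distribute_nodes(nodes, 10, "epoch-1")
print([len(shard) for shard in shards])  # ten shards of 100 nodes each
```

Because the ranking depends only on the seed and the node identifiers, every node can compute the same assignment locally, and changing the seed reshuffles the shards.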

However, the shards are not independent of each other as they are still part of one system. Each system, whether monolithic or using sharding, must maintain a single valid global state.

The global state is the representation of the ownership and transfer of assets on the blockchain. The global state is stored and updated by the nodes that participate in the consensus. Don't forget that one of the key innovations that blockchain brings is protection against double-spend attacks. What's tricky is managing this task within a distributed network. For a centralized server, this is an easy task.

From a security perspective, all nodes in the network must be aware of the global state.

In a monolithic blockchain, the global state is stored on all nodes, making it easy to maintain data consistency and completeness at any given moment. However, this places demands on nodes in terms of resources (storage, bandwidth, computation). A monolithic system can tend towards centralization, has higher node performance requirements, and finds it very difficult (if not nearly impossible) to achieve high scalability.

In a sharded blockchain, it is more difficult to maintain the consistency of the global state as it is maintained in individual shards. Each shard has only partial knowledge of the global state as it does not have all the data. There must be some synchronization mechanism (protocol) to ensure that the global state is consistent across shards. This brings complexity and challenges from a security perspective. On the other hand, it is relatively easy to achieve high scalability as transactions are validated in shards. However, the number of shards is limited by the need for cross-shard communication.

A monolithic blockchain can have a simpler consensus and is more transparent. In the case of Cardano, this will not be true after the upgrade to Ouroboros Leios, as there will be 3 types of blocks with different timings. Another advantage is that security is significantly easier to ensure thanks to higher data availability: every node sees all the data, which makes a monolithic blockchain highly resistant to double-spend and replay attacks.

The biggest challenge for monolithic blockchains is achieving fast finality of transactions (and blocks) without sacrificing decentralization and security. Complexity (higher communication requirements) can grow with the number of nodes, limiting scalability.

Let us add that Cardano uses Nakamoto-style consensus, i.e. probabilistic finality. Thus, transaction finality is slow compared to networks with provable finality (e.g. Ethereum).

The biggest advantage of sharded blockchain is the high scalability achieved through parallelization for transaction verification. In other words, transaction validation is distributed among several partially independent groups of nodes (shards). This has additional advantages as it reduces the storage, bandwidth, and computational power requirements of individual nodes. The network can be more decentralized and it may be more economical to run your own node. The disadvantage is complexity, especially in terms of synchronization and management of shards, cross-shard communications, conflict resolution between shards, etc. The biggest challenge for teams building sharding is achieving high security and reliability in a relatively complex system that depends on the Internet.

Cross-shard communication overhead

Let's now look at the sharded blockchain from the perspective of assets and applications. If there were only a limited number of applications and assets per shard (in the extreme case, a single application and a single token type per shard), the user-friendliness and usability of the system would obviously suffer. It doesn't make sense to have, for example, only ADA coins in one shard, HOSKY in a second shard, and DJED in a third shard. How could a decentralized exchange function in such an environment? In which shard would you place it?

This problem is solved by cross-shard communication. It is the process that allows shards to exchange information and coordinate actions in a system. Cross-shard transactions involve multiple shards. It is necessary to ensure atomicity and consistency across shards. Let’s demonstrate it via a few simple examples.

If Alice wants to send token X from shard 1 to Bob on shard 2, the system needs to ensure that the transaction is valid, final, and consistent on both shards and that Alice cannot double-spend token X on other shards. As you can see, the nodes in one shard are not able to validate this transaction and declare it as final (complete). It is necessary for nodes from both shards to validate the transaction. It is also necessary to ensure that token X is not spent in the other shards. Other nodes (from other shards) should also have at least partial knowledge of the state of token X. Remember when I talked about the global state?

In the case of applications, it is very similar. Cross-shard smart contracts require communication across multiple shards and data (or logic) from other shards. For example, if a DEX smart contract (deployed on shard 3) wants to swap token X from shard 1 with token Y from shard 2, the system needs to ensure that the smart contract can access and verify the data and state of token X and token Y on their respective shards and that the swap is executed atomically and consistently across shards.
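
The token transfer from Alice to Bob can be sketched as a lock-then-commit exchange between two shards. This is a toy model under stated assumptions: the `Shard` class, its methods, and the `target_available` flag are hypothetical names invented for this sketch, and a real protocol would additionally need consensus inside each shard, signatures, and timeouts.

```python
class Shard:
    """Toy shard that tracks balances of a single token X."""

    def __init__(self, name):
        self.name = name
        self.balances = {}  # owner -> amount of token X
        self.locked = {}    # transfer id -> (owner, amount) awaiting commit

    def prepare_debit(self, transfer_id, owner, amount):
        # Phase 1: lock the funds so they cannot be spent a second time.
        if self.balances.get(owner, 0) < amount:
            return False
        self.balances[owner] -= amount
        self.locked[transfer_id] = (owner, amount)
        return True

    def commit_debit(self, transfer_id):
        # Phase 2 (success): the locked funds are gone for good.
        self.locked.pop(transfer_id)

    def abort_debit(self, transfer_id):
        # Phase 2 (failure): return the locked funds to the owner.
        owner, amount = self.locked.pop(transfer_id)
        self.balances[owner] = self.balances.get(owner, 0) + amount

    def credit(self, owner, amount):
        self.balances[owner] = self.balances.get(owner, 0) + amount


def cross_shard_transfer(transfer_id, source, sender, target, recipient,
                         amount, target_available=True):
    """Either both shards apply the transfer or neither does (atomicity)."""
    if not source.prepare_debit(transfer_id, sender, amount):
        return False
    if not target_available:  # stand-in for a failed or unreachable target shard
        source.abort_debit(transfer_id)
        return False
    target.credit(recipient, amount)
    source.commit_debit(transfer_id)
    return True


shard1, shard2 = Shard("shard 1"), Shard("shard 2")
shard1.balances["alice"] = 10

first = cross_shard_transfer("t1", shard1, "alice", shard2, "bob", 10)
second = cross_shard_transfer("t2", shard1, "alice", shard2, "bob", 10)  # double-spend attempt
failed = cross_shard_transfer("t3", shard2, "bob", shard1, "alice", 4,
                              target_available=False)
```

The second transfer is rejected because Alice's tokens were already debited on shard 1, and the third leaves Bob's balance untouched because the aborted lock is rolled back. Even in this toy version, one user-level transfer costs work and messages on two shards.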

Now try to imagine what all this means in the context of scalability, time synchronization of shards, storage (data availability), communication (bandwidth), conflict resolution, attack prevention, keeping the global state, etc. Cross-shard synchronization is the process that ensures that all shards have a consistent and up-to-date view of the global state of the system.

The dependence of the shard on other shards for the validation of cross-shard transactions (or smart contracts) is an unwanted (but necessary) feature since the shard is not fully autonomous (independent of its surroundings). If one shard has problems (for example, due to an attack, network issues, or lower performance), it can affect the other shards, i.e. the whole system. The network consensus must therefore be designed very carefully and take these eventualities into account.

There is one big challenge for teams trying to implement consensus sharding. What if a large (over half) number of transactions require cross-shard communication? In this case, the sharded blockchain might not be as efficient as initially anticipated. Fortunately, sharded blockchains already exist, so we can observe and compare individual implementations.

Transaction finality in the context of scalability

To understand the topic, it is necessary to understand transaction finality in the context of scalability. Finality means that once a transaction is confirmed by the network, it cannot be reverted or changed. This ensures the security and integrity of the system and prevents double-spending or replay attacks.

Finality affects the performance and efficiency of the system, as it reduces the latency and overhead of waiting for confirmations or resolving conflicts.

Simply put, until transaction X is final, the asset that was transferred by transaction X has no certain owner (it can be the original sender or the new recipient). Obviously, using this asset (with an uncertain owner) again in another transaction Y is “risky” because if the previous transaction X is reverted, transaction Y (which spends the same asset) must also be reverted. The problem can chain through further transactions. It is difficult for a blockchain to revert a single transaction, so it would be necessary to revert the entire block. This would significantly break the consistency of the data.
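
The chaining effect can be made concrete with a small sketch. It assumes a hypothetical map `spends_output_of` that records, for each transaction, which earlier transactions it takes its assets from; the function name and the data are invented for illustration.

```python
from collections import deque


def transactions_to_revert(reverted_tx, spends_output_of):
    """Return every transaction that must be undone if `reverted_tx` is reverted."""
    # Invert the map: parent transaction -> transactions that spend its outputs.
    children = {}
    for tx, parents in spends_output_of.items():
        for parent in parents:
            children.setdefault(parent, set()).add(tx)

    affected = {reverted_tx}
    queue = deque([reverted_tx])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# Y spends the asset moved by X, Z spends the asset moved by Y, W is unrelated.
dependencies = {"X": set(), "Y": {"X"}, "Z": {"Y"}, "W": set()}
affected = transactions_to_revert("X", dependencies)
print(sorted(affected))  # ['X', 'Y', 'Z']
```

The longer a non-final transaction is built upon, the larger this set becomes, which is why spending assets with an uncertain owner is risky.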

Finality is important for scalability in general but particularly for sharded blockchains since it enables faster and simpler cross-shard transactions. If the transactions are not final, they may create inconsistencies or conflicts among shards, which can affect the scalability and security of the system. For example, if Shard 1 confirms a transaction that transfers token X from Alice to Bob, but Shard 2 does not confirm it yet, Alice may try to spend token X again on Shard 2, causing double spending. To prevent this, the system needs to ensure that the transaction is final on both shards before allowing Alice or Bob to use token X on other shards.

Another reason why finality is important for scalability is that it enables more efficient and flexible consensus protocols. Consensus protocols are the rules that determine how the nodes agree on the state of the blockchain and resolve conflicts. If the transactions are not final, they may create forks or reorgs, which can affect the scalability and security of the system.

For example, if node A confirms a block that contains transaction T1, but node B confirms a different block that contains transaction T2, they may create a fork in the blockchain, which can cause confusion or inconsistency. To resolve this, the system needs to use a consensus protocol that can handle forks or reorgs (e.g. longest chain rule). However, these protocols can be slow, costly, or complex, which can limit the scalability and performance of the system.
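
A minimal sketch of the longest chain rule shows what a reorg means for node A. Chains are modelled as plain lists of block identifiers, and the function names are assumptions made for this example; the chain selection rule actually used by Ouroboros is more nuanced than a bare length comparison.

```python
def longest_chain(chains):
    """Pick the chain with the most blocks (ties go to the first one seen)."""
    return max(chains, key=len)


def reorged_blocks(local_chain, new_chain):
    """Blocks of the local chain that are abandoned when switching to new_chain."""
    common = 0
    for local_block, new_block in zip(local_chain, new_chain):
        if local_block != new_block:
            break
        common += 1
    return local_chain[common:]


chain_a = ["genesis", "b1", "b2-with-T1"]        # node A's view
chain_b = ["genesis", "b1", "b2-with-T2", "b3"]  # node B's view, one block longer

winner = longest_chain([chain_a, chain_b])
dropped = reorged_blocks(chain_a, winner)
print(dropped)  # ['b2-with-T1']
```

Node A has to discard the block containing T1, so T1 was never final from its point of view, only probable.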

A sharded blockchain cannot operate efficiently and reliably unless fast transaction finality (provable finality) is ensured, not only within shards but also for cross-shard communication.

How to calculate system throughput?

It is difficult to estimate which system may have a higher throughput. The problem is that transactions cannot always be validated independently, i.e. in parallel. There are some differences between UTXO and account-based blockchains, but we'll get to that. If cross-shard communication is required to validate transactions, it introduces overhead and latency, which reduces throughput. It is therefore not possible to calculate the total throughput of a sharded system simply by multiplying the shard throughput by the number of shards.

Similarly, it is not possible to calculate the throughput of monolithic systems by multiplying the processing capacity per validator by the number of validators. Validators may have different processing capacities, so it is possible that the least powerful validator will affect the overall system performance. It also depends on the specific consensus.

Achieving high scalability (and also fast finality) in a monolithic blockchain is not easy. There are many challenges and teams must carefully balance between trade-offs. Faster finality usually requires faster block production and propagation, which can compromise the security and decentralization of the system (risk of forks or reorgs). Faster finality usually requires more sophisticated protocols or mechanisms (many nodes have to participate in the voting within the production of each new block), which can compromise the simplicity and transparency of the system. Consensus with a fast finality can require more communication or synchronization among nodes, which can increase the latency or overhead.

Sharded blockchains currently have higher throughput (at least on paper), but their reliability will only become apparent at higher system loads when cross-shard transactions need to be processed.
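
To see why multiplying the shard throughput by the number of shards overestimates the result, consider a deliberately simple model. It assumes that a cross-shard transaction consumes processing capacity on every shard it touches and ignores latency, coordination messages, and uneven load; the function and its parameters are hypothetical and not taken from any real network.

```python
def effective_throughput(num_shards, shard_tps, cross_fraction, shards_touched=2):
    """Toy estimate of user-level transactions per second in a sharded system."""
    if not 0.0 <= cross_fraction <= 1.0:
        raise ValueError("cross_fraction must be between 0 and 1")
    # Average amount of shard work caused by one user-level transaction.
    work_per_tx = (1.0 - cross_fraction) + cross_fraction * shards_touched
    return num_shards * shard_tps / work_per_tx


for fraction in (0.0, 0.5, 1.0):
    print(fraction, round(effective_throughput(10, 100, fraction), 1))
# 0.0 1000.0
# 0.5 666.7
# 1.0 500.0
```

With 10 shards of 100 transactions per second each, the naive figure of 1000 only holds when no transaction crosses a shard boundary. If half of the transactions touch two shards, the model already loses a third of that capacity, before any communication overhead is counted.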

How do the accounting model and consensus affect scalability?

What can affect throughput is, surprisingly to some, the accounting model. The account-based model used by Ethereum and most smart contract platforms does not allow for parallel transaction processing. It is necessary to maintain transaction order during validation (the system maintains a shared global state). In other words, transactions are interdependent. When validating a transaction, it is necessary to consider the global state, which must not change while the transaction is being validated (atomicity).

The account-based model requires sequential processing of transactions within but also across shards, because each account may depend on (or conflict with) other accounts or transactions. Parallelization through shards can improve the throughput of the system, but only if cross-shard communication is handled efficiently.
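
A small example shows why order matters in an account-based ledger. The balances, names, and the `apply_in_order` helper are invented for illustration and greatly simplify what a real platform does (no fees, nonces, or contracts).

```python
def apply_in_order(balances, transfers):
    """Apply transfers one by one against shared account balances."""
    state = dict(balances)
    results = []
    for sender, recipient, amount in transfers:
        if state.get(sender, 0) >= amount:
            state[sender] -= amount
            state[recipient] = state.get(recipient, 0) + amount
            results.append(True)
        else:
            results.append(False)  # insufficient balance at this point in the order
    return state, results


balances = {"alice": 5, "bob": 10, "carol": 0}
t1 = ("bob", "alice", 10)    # Bob pays Alice 10
t2 = ("alice", "carol", 12)  # Alice pays Carol 12

state_a, results_a = apply_in_order(balances, [t1, t2])
state_b, results_b = apply_in_order(balances, [t2, t1])
print(results_a)  # [True, True]
print(results_b)  # [False, True]
```

The same two transactions produce different outcomes depending on their order, so a validator cannot simply check them in parallel against the same starting state.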

Cardano uses the Extended UTXO (eUTXO) model, which allows parallel processing of transactions. Transactions that spend different UTXOs are not dependent on each other during validation, and their order in the block does not matter.

The UTXO model allows for more parallel processing of transactions within and across shards because each UTXO is independent and can be verified without reference to other UTXOs (or accounts).
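
The following sketch illustrates the idea under simplifying assumptions. Transactions are reduced to the set of UTXOs they spend, the per-transaction check only tests that those inputs exist, and a final pass rejects transactions that try to spend an already claimed UTXO. The data layout and function names are hypothetical and do not reflect how a Cardano node is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def inputs_exist(tx, utxo_set):
    """Per-transaction check that needs no knowledge of other transactions."""
    return all(utxo in utxo_set for utxo in tx["inputs"])


def validate_batch(transactions, utxo_set):
    # Step 1: the independent checks can run in parallel, in any order.
    with ThreadPoolExecutor() as pool:
        checks = list(pool.map(lambda tx: inputs_exist(tx, utxo_set), transactions))

    # Step 2: a cheap pass rejects transactions whose inputs are already claimed.
    claimed = set()
    accepted = []
    for tx, ok in zip(transactions, checks):
        inputs = set(tx["inputs"])
        if ok and claimed.isdisjoint(inputs):
            claimed |= inputs
            accepted.append(tx["id"])
    return accepted


utxo_set = frozenset({"u1", "u2", "u3"})
batch = [
    {"id": "t1", "inputs": ["u1"]},
    {"id": "t2", "inputs": ["u2"]},
    {"id": "t3", "inputs": ["u1"]},  # tries to spend u1 a second time
    {"id": "t4", "inputs": ["u9"]},  # spends a UTXO that does not exist
]
accepted = validate_batch(batch, utxo_set)
print(accepted)  # ['t1', 't2']
```

Only transactions that compete for the same UTXO interact with each other; everything else can be checked independently, which is the property that makes the model attractive for sharding.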

Cardano could potentially become a sharded blockchain, but first of all, the team needs to implement fast finality (provable finality). With probabilistic finality, it makes no sense to consider sharding. At least from our point of view. Once Ouroboros Leios is implemented, sharding can be considered.

Let's go back to the finality of transactions and blocks. Finality is based on nodes voting on (approving) transactions. Once a transaction is approved by a certain percentage of nodes in the network, it becomes irreversible. The finality of blocks (and therefore transactions) in the Cardano network is currently slow because the weight behind a block grows only gradually, with each new block added on top of it. Voting by just 10% of nodes in the Cardano network (adding a new block and thus approving all previous ones) can take up to 1 hour.

Conclusion

The scalability of decentralized blockchains (L1) is a very complex topic. The team and the Cardano community should consider implementing sharding, but this cannot be done without faster transaction finality. We have to wait for the implementation of Ouroboros Leios and then, maybe, it will make sense to start thinking about sharding. The UTXO accounting model is suitable for sharding as it allows parallelization not only in shards but also for cross-shard communication. Sharding is a major technological challenge when you consider maintaining decentralization, security, reliability, and other things like the fairness of rewards. If second layers, the current effort to increase scalability, do not take hold, we will have no choice but to increase the scalability of the first layers. Finally, there are other interesting ways to increase scalability besides sharding.
