status

Exploring Scalability Options of the UTxO Model

Published 30.4.2024

The UTxO model enables parallel processing of transactions, a feature that is essential for achieving high scalability and fast finality of transactions. In this article, we will dive into various strategies for improving network consensus, which can lead to increased network throughput and faster transaction finality. In the article you will learn about Input Endorsers, sharding, and how protection against double-spend attacks limits scalability.

Basic Features Of The UTxO Model

UTxOs are independent and immutable objects. Once a UTxO is created as an output of a transaction, it remains unchanged until it is spent in a new transaction. When it is spent, it is consumed entirely and a new UTxO is created as an output of the new transaction.

The image illustrates the continuous and parallel submission of new transactions by users. Each transaction uniquely points to input UTxOs from the UTxO set, as indicated by the red arrows leading to the red UTxOs. The term STATE 1 represents the fresh global state that emerges on each node following the acceptance of a new block. The red UTxOs have been removed from the UTxO set. The UTxO set has been updated with the addition of newly formed UTxOs, depicted by the green arrows pointing towards the green UTxOs.

The UTxO model is pivotal in managing the global state during transaction validation. The global state in Cardano is represented by an active collection of UTxOs, also known as the UTxO set. With the addition of each new block, all newly created UTxOs are incorporated into the UTxO set. Conversely, UTxOs that are consumed (spent) by transactions are removed from the UTxO set. UTxOs can be perceived as disposable entities.

Every node in the network maintains its individual UTxO set. The majority of nodes adhering to the consensus possess an identical UTxO set (global state), as they preserve the same blockchain history, inclusive of the most recently appended block. Nevertheless, due to delays in network data transmission, the global state does not undergo instantaneous changes across all nodes simultaneously, but rather with a minor delay. It can be postulated that at the moment when a block producer node disseminates a new block into the network, all nodes share the same global state (meaning they received the last block and updated the global state accordingly).

How To Increase The Finality Of Transactions?

Cardano employs a consensus mechanism akin to Nakamoto-style consensus. This is typified by the probabilistic finality of blocks, and consequently, transactions. This implies that if a transaction is included in a new block, referred to as BLOCK 1, it has a confirmation count of zero. When a block is added, there exists a chance that this block may not persist in the blockchain as it could be supplanted by an alternative block created around the same time. The assurance that a block will endure in the blockchain escalates with the addition of new blocks beyond BLOCK 1. This is often referred to as an increase in the number of confirmations.

In the picture, you can see that after BLOCK 0 there was a fork of the blockchain. Two alternative BLOCK 1 were produced. After appending more blocks, the upper chain prevailed. In the upper chain, there was transaction TX 1.

Cardano exhibits low transaction finality. The block time in Cardano is configured to 20 seconds. This denotes that if your transaction necessitates 10 confirmations, you would have to wait approximately 200 seconds, which is roughly 3 minutes.

Thanks to the independence of the UTxOs within the UTxO set, the validation of transactions is also mutually exclusive. Each node can validate a transaction or a group of transactions. The sequence of transactions within a block is inconsequential, as they do not influence each other. Consequently, a node can not only validate individual transactions upon receipt but also append an approval or vote in some form before disseminating the transaction to the network. Every subsequent node that receives the transaction will also observe the approvals from the nodes that have received the transaction earlier.

In the picture, you can see how the TX 1 transaction gradually collects approvals on every other node to which it is diffused. When NODE 4 produces a new block, the transaction already has 4 approvals. Note that NODE 4 was the first to modify its mem-pool. NODE 4 removed the old input UTxO (red) from transaction TX 1 and inserted the newly created UTxO (green).

Transactions that have been in a node’s memory pool for a longer period will generally have more approvals and could be given higher priority when being added to a block. This is already a feature of Cardano transactions, as Cardano operates on a first-in, first-out basis. Crucially, if a transaction garners approval from the majority of nodes in the network, the likelihood of it being discarded decreases, even if the blockchain undergoes reorganization.

The process of reaching a consensus on the inclusion of transactions can occur more rapidly and independently of the production of new blocks.

Even if the block containing the transaction only has a few confirmations (possibly just one), there will already be a consensus among the network nodes that the transaction should be part of the blockchain. The network can be designed such that the oldest transactions with a large number of approvals are included in the blockchain at the earliest opportunity.

However, the potential reorganization of the blockchain poses a challenge to the finality of transactions. Despite a transaction having a high number of approvals, it is theoretically possible for it to be temporarily removed from the blockchain. This can occur if a portion of the chain (the most recent blocks of the blockchain) is replaced by an alternate version. The blocks in this new sub-chain version may contain a different set of transactions. Therefore, enhancing the finality of transactions involves altering the consensus to prevent blockchain reorganization, or modifying the rules to reflect the number of transaction approvals.

In the image below you can see the unwanted reorganization of the blockchain. TX1 in the upper chain had 4 approvals and one confirmation (BLOCK 2). Nevertheless, in the end, the lower chain in which there is no TX 1 transaction prevailed.

The transaction finality can be enhanced by nodes concurring on its integration into the blockchain during its network propagation. This is feasible due to the UTxO model, as the validation of a transaction merely requires the verification of the existence of input UTxOs (their presence in the UTxO set), signifying they are spendable.

Below in the article, we will show how it is theoretically possible to achieve faster finality of transactions before the network produces the next block.

Additionally, finality can be augmented by expediting the consensus on the newly appended block within the network. This can be achieved by nodes casting votes on a block either before its addition to the blockchain or shortly thereafter.

Input Endorsers

Input Endorsers is a plan to improve the scalability of Cardano. This improvement can be seen as a form of voting before a block (consensus block) is added to the blockchain.

The Input Endorsers use 3 types of blocks: Input Blocks (IB), Endorsement Blocks (EB), and Ranking Blocks (RB). Each type of block will be minted at a different frequency. Some blocks may be minted in the same slot. Input Blocks will be minted with the fastest frequency.

Input Endorsers keep track of all submitted transactions and bundle these transactions into pre-constructed blocks. The main purpose of the Input Endorser feature is to separate transaction selection from block production.

In this model, transactions could collect consents on their way to mem-pools. This is because Input Endorsers incorporate elements of parallelism and concurrency into the consensus while maintaining the linearity of the blockchain.

The Input Endorsers algorithm will facilitate the preparation of a large number of transactions during the transaction selection phase, as it allows for the concurrent participation of more nodes in the network. Endorser nodes validate and pre-approve transactions, which aids in diminishing the computational burden and bandwidth demands of the ranking block producers, while also enhancing the diversity and availability of the transaction pool. Moreover, they expedite the confirmation of transactions that have yet to be incorporated into the chain by supplying verifiable proof of endorsement.

The deployment of Input Endorsers is made feasible solely due to the UTxO model, as it allows nodes to validate transactions independently. The sole linearity that needs to be preserved in the system is at the level of ranking blocks, given that a blockchain is a linear sequence of blocks. The sequence of transactions within a block, inclusive of ranking blocks, is inconsequential. Interestingly, ranking blocks do not house transactions, but merely contain references to them via endorsement blocks.

The image depicts users submitting transactions that refer to input UTxOs (illustrated as red UTxOs) from the active set of UTxOs (the global state). These transactions undergo parallel validation and are subsequently incorporated into Input Blocks. Nodes can endorse these Input Blocks concurrently. The system maintains linearity and the utmost level of validation, which safeguards against double-spend attacks and the like, at the highest level of the consensus where Ranking Blocks are attached to the blockchain.

With each newly appended block to the blockchain, the global state also changes. As described above, consumed UTxOs are removed from the UTxO set, and new UTxOs (green UTxOs from transactions) are simultaneously inserted into it.

Note: The image above only shows the consumption of UTxOs, not the insertion of new UTxOs into the UTxO set.

Protection Against Double-Spent Attack

Protection against double-spending attacks is one of the factors that limit the scalability and parallel processing of transactions. The input for validating transactions is the global state. Node validates transactions in the context of the current global state that it maintains.

The accounting model fundamentally affects how nodes work with the global state and what options they have regarding the parallelization of transaction validation and scalability.

In the context of the UTxO model, a double-spend attack can occur when a user submits two transactions that reference the same input UTxO. This is akin to trying to spend the same money twice. A sophisticated attacker can submit transactions to the network through different nodes, i.e. in different locations of the network. This means that both transactions will be considered valid by a certain group of nodes until another transaction arrives spending the same input UTxO. That's when a conflict occurs.

In an ideal scenario, a double-spend attack is identified at the earliest opportunity. One transaction will be incorporated into the new block, while the other will be dismissed. The transaction that is included will influence the global state. The consumed UTxO will become unspendable, rendering the second transaction permanently invalid as it refers to an already spent UTxO.

With the implementation of Input Endorsers, a double-spend attack can potentially be detected at the stage of Input Blocks or Endorsement Blocks. Even if this detection does not occur, the attack will assuredly be identified by the node during the construction of the Ranking Block.

The image illustrates an attempted double-spend attack. Transaction TX 1 is placed into a different Input Block compared to transaction TX 2. Subsequently, both transactions are included in separate Endorsement Blocks. At this stage, the network can potentially identify the conflicting transactions and discard one of them. However, if this does not occur, the conflict will be resolved during the construction of the Ranking Block. The node responsible for building the Ranking Block possesses all the necessary information to detect conflicting transactions and resolve the issue.

Observe that the system facilitates the parallel processing of transactions while maintaining the ability to deterministically settle conflicts and thwart double-spend attacks. Most transactions are incorporated into the Ranking Block, but transactions in conflict stand no chance of passing validation.

Envision an improvement where the finality of transactions is not contingent on the number of confirmations at the Block Ranking level, but rather on the count of endorsements. This implies that finality would be roughly proportional to the velocity of transaction propagation within the network. The node that the user utilized to submit the transaction to the network must become aware of the number of endorsements. This necessitates a mechanism for disseminating endorsement blocks. The node that initiated the transaction will receive the Endorsement Block and will observe the number of approvals for the user’s transaction within it. This information can be displayed to the user via the wallet or through the blockchain explorer. If the transaction garnered approvals from, hypothetically, 51% of the stake and it was nearly certain that there was no conflicting transaction, the transaction could be considered final in theory. However, the transaction would only attain ultimate finality once it becomes a part of the Ranking Block.

The UTxO Model And Sharding

Sharding is a key technology that enhances the scalability of blockchains. The blockchain network is divided into smaller parts or sub-networks, known as 'shards'. The total computational and storage workload is divided into shards.

Sharding is a form of parallelism in a system. Each shard operates independently and can process its own set of transactions. This parallel execution of transactions increases the overall system efficiency and throughput.

Each shard maintains its portion of the global state. This is known as the local state. The global state of the entire blockchain is then the aggregate of all these local states.

Cross-shard communication is an important aspect of sharding. It refers to the process of transactions or information being exchanged between different shards. This is crucial because the throughput of a sharded blockchain network is significantly affected by cross-shard communication.

The UTxO model is more advantageous for sharding than the account-based model.

In the context of the UTxO model, the transaction serves as the fundamental unit of validation. A transaction is an entity that can be validated independently by multiple parties concurrently. The outcome of these validations will invariably be identical. Cardano executes transaction validations deterministically. Multiple parties can validate transactions in parallel and any order, irrespective of the block, meaning regardless of when the transactions will be incorporated into the block and regardless of the sequence of transactions within the block.

Contrastingly, in Ethereum’s account-based model, the block is the primary unit of validation. Transactions are interdependent. An Ethereum transaction cannot be validated in isolation, but only concerning the prevailing global state. The global state in Ethereum can be conceptualized as a sequence of transactions. The result of transaction validation is contingent on its position within this sequence.

The disparity in accounting models significantly influences the implementation of sharding.

To put it simply, consider the Cardano network as a single shard. If we had 10 shards, it would be akin to having 10 Cardano networks operating side by side. Theoretically, this could lead to a tenfold increase in network throughput. However, in practice, this might not be the case due to factors such as cross-shard communication, the need to ensure data consistency and security, network latency, and management overhead (like balancing the load between shards, managing nodes joining and leaving, and routing transactions).

For blockchains that use the UTxO model, implementing sharding is relatively straightforward, as the global state naturally allows for parallelism and transactions can be validated independently within each shard.

The UTxO model has its advantages when it comes to cross-shard communication, as it’s relatively easy to create a transaction that requires validation from two shards.

Imagine a transaction that has input UTxOs from the global state of SHARD 1 and creates output UTxOs in the global state of SHARD 2. This transaction would be part of the blockchain in both shards. The advantage here is that validation can occur regardless of the block, i.e., irrespective of the transaction’s order in the block.

In each shard with Input Endorsers, a cross-shard transaction could receive endorsements (approvals) shortly after its submission. The transaction would be submitted in SHARD 1, which would immediately forward it to SHARD 2. In both shards, the transaction would receive endorsements simultaneously. In both shards, a transaction could potentially gain support from, say, 51% of the stake in a very short time. There would need to be a mechanism in place to share endorsement blocks between shards (a subset of data related to cross-chain transactions would suffice). At the time when each of the shards would insert a transaction into the ranking block, the given shard could expect with a high level of certainty that the transaction would be inserted into the ranking block in the other shard, thanks to a high number of endorsements.

Theoretically, SHARD 1 does not have to wait for the transaction to be inserted into the Ranking Block. Of course, only under the assumption that there will be a mechanism (and incentives) that will ensure that a transaction with a high number of endorsements from the majority of the stake always gets into one of the next blocks.

In the picture, you can see transaction TX 1, which has an input UTxO from SHARD 1. The result of the transaction is an output UTxO that is created in SHARD 2. It can be said that the value is transferred between shards. The transaction gets to the Input Block in SHARD 1. SHARD 1 forwards the transaction to SHARD 2. TX one gets to the Input Block in SHARD 2. Transaction TX 1 gradually gets to the Endorsement Blocks in both shards and finally to the Ranking Blocks. As soon as the transaction is inserted into the Ranking Block, the local state in the given shard will be changed. The input UTxO is removed from the local state of SHARD 1. A new UTxO will be created in the local state of SHARD 2.

The UTxO model offers several benefits, including its applicability to both Input Endorsers and sharding. The aforementioned example illustrates the integration of these two technologies. Once Input Endorsers are in place, sharding has the potential to enhance Cardano’s scalability. However, Input Endorsers must expedite transaction finality to prevent sluggish cross-shard communication and ensure the integrity of the global state.

Why Is It Difficult To Implement Sharding In Ethereum?

Ethereum operates on an account-based model where each account has an associated state. The global state of Ethereum is essentially a database of all accounts and their current asset balances. With each new block, the system’s state updates based on the transactions within that block.

Transactions are processed by checking the sender’s balance for sufficient funds. If successful, the sender’s balance decreases while the recipient’s increases. The Ethereum Virtual Machine (EVM) then computes a new state from the current state and transaction, which becomes the basis for the next transaction. This continuous evolution of the blockchain state means transaction order is crucial for validation.

Each transaction can potentially alter any account’s state, and its outcome may depend on its processing order. To maintain consistency and prevent double-spending, the global state ‘locks’ during each validation, allowing only one transaction to process at a time. This locking mechanism is vital for system integrity, preventing simultaneous transactions from causing inconsistencies or double-spending.

Transaction inputs are account balances, a shared resource. This means the balance at transaction submission could differ from the balance at validation, making the transaction input non-deterministic, unlike the UTxO model.

In Ethereum, balances are enduring entities that can be modified by transactions at any time, but two transactions cannot simultaneously modify the same balance.

In the picture, you can see how users submit 5 transactions. Transactions can be submitted in parallel. Note that several transactions adjust the same balance. For example, TX 1 and TX 2 subtract a value from the same balance. Transactions TX 3 and TX 4 add value to the same balance. STATE 1 to STATE 4 represent the sequential processing of transactions and thus the gradual updating of the global state. The global state is made up of the same balances, i.e. permanent objects whose state can be changed many times in a row through transactions.

Imagine how much more complicated cross-shard communication is if a transaction has to simultaneously change one balance in SHARD 1 and another in SHARD 2. The order of transactions in the Ethereum block plays an important role in the validation result. With the current implementation of EVM, it would be necessary for both shards to include the transaction in the block and to pass information about the validation results to each other. It makes no sense that in SHARD 1 the transaction validation was successful and the value was deducted from the balance, while in SHARD 2 the transaction validation would fail, i.e. the value would not be credited to the balance. Cross-shard communication severely limits the ability to construct a block in each shard.

In the picture, you can see transaction TX 1, which is inserted into the mem-pool after submission. It is then inserted from the mem-pool into BLOCK 1. Validation of transaction TX 1 changes the current local state in SHARD 1. However, SHARD 1 needs to know the result of validation in SHARD 2. SHARD 1 therefore communicates with SHARD 2 about the result of the validation of the transaction in BLOCK 2 (SHARD 2). So SHARD 2 must insert transaction TX 1 into BLOCK 2 and perform validation. It can then communicate the validation result to SHARD 1. Note that SHARD 1 must know the validation result in SHARD 2 before adding another transaction to its BLOCK 1.

Indeed, the example provided is a simplified (naive) implementation of sharding. It highlights the significant challenges Ethereum’s team faces in implementing sharding with the current consensus mechanism.

Maintaining transaction atomicity during cross-shard communication in an account-based blockchain is a complex task. It’s vital to ensure that a transaction is either entirely executed or not at all to uphold the system’s integrity.

Various strategies and protocols, including two-phase commit protocols, have been proposed to tackle this issue. However, these solutions also present their trade-offs and complexities. It’s noteworthy that several projects have successfully implemented sharding within a blockchain using the account-based model.

Conclusion

Altering consensus mechanisms is just one method to enhance network scalability. Scalability can also be improved through second-layer solutions, layered architecture, or by boosting efficiency. For instance, Plutus V2 is more efficient than its predecessor, Plutus V1, resulting in block space savings. This efficiency allows for more transactions per block, thereby increasing network throughput. However, Input Endorsers can provide more substantial scalability improvements than efficiency gains alone.

Some members of the Cardano community believe that the focus should be more on enhancing scalability through ZK cryptography rather than Input Endorsers. Cardano will incorporate ZK cryptography through the Chang hard fork. Owing to Cardano’s determinism, it will be feasible to construct ZK rollups that don’t require sequencers.

The UTxO model is very well suited for Input Endorsers or sharding, as the global state composed of disposable objects is well parallelizable. Several nodes in the network can validate transactions independently of each other in parallel, even in the case of sharding.

Featured:

Related articles

Did you enjoy this article? Other great articles by the same author