The Ethereum rollup space has seen significant developments over the past two years. Validity rollups, also known as ZK-rollups, have emerged as highly competitive, practical, and inexpensive Layer 2 solutions in recent months. Moreover, with the upcoming Dencun hard fork and the implementation of data blobs (EIP-4844), understanding the processes and costs associated with verifying validity proofs on the Ethereum mainnet is crucial.
If you’re not yet familiar with rollups, we recommend reading this insightful post from the Matter Labs team.
Starknet and zkSync Era, in particular, have become prominent competitors, often matching or even exceeding the number of user operations executed on the Ethereum mainnet. Our goal is to accurately compare their systems in terms of transaction volumes, relative compute, and compression efficiency, delving into aspects such as Layer 1 verification costs, Layer 2 transactions, and resource consumption on Layer 2. They were chosen for being the largest Layer 1 gas spenders, with readily available analytics infrastructure; Linea and Scroll will be left to a future study.
The following sections will provide a detailed analysis of the costs associated with submitting proofs and state data on-chain. We will dissect the Layer 1 smart contracts responsible for this process and evaluate the average on-chain verification cost per transaction.
Additionally, we will project the impact of EIP-4844 on these costs, highlighting which chain might benefit more from this upgrade. Lastly, we will discuss various techniques for comparing the true scalability of the two networks, aiming for a balanced and nuanced comparison of their efficiencies.
Starknet processes transactions differently than mainstream EVM-compatible chains. In our later analysis, instead of analyzing Starknet’s transaction volume, we will focus on user operations as defined by bartek.eth. A user operation represents a ‘standard’ operation a user wants to execute, including token transfers, token swaps, mints, or vote casts.
On most EVM-compatible chains, each operation is sent as a single transaction. When account abstraction is available, users can bundle multiple operations into a single transaction. On Starknet, this is known as a multi-call. Due to this difference, direct ‘transaction-denominated’ comparisons, such as transactions-per-second (TPS) comparisons, underrepresent the performance of rollups with native Account Abstraction.
For example, approximately 20% of all transactions on zkSync Era are ERC20 Approve transactions, while almost all Approve transactions are bundled together on Starknet. Effectively, the same set of users and actions registers roughly 20% fewer TPS on Starknet.
When comparing transaction loads between Starknet and other chains, we utilize the ‘UserOperation’ metric (abbreviated as UOP or UOPS).
Currently, validity rollups are fully featured and competitive with the Ethereum mainnet in several ways. Ethereum averages ~550,000 daily contract call transactions (excluding native ETH transfers). Starknet’s volumes are more variable, yet its average daily user operations volume is slightly higher. zkSync Era paints a similar picture, characterized by high variability, with volumes frequently surpassing those of the Ethereum mainnet.
In the chart above, daily transactions on zkSync Era reached an all-time high on December 16th, due to massive volumes of BRC-20 inscriptions being minted.
We utilize the zkSync Era `zks_getL1BatchDetails` RPC method to retrieve the mainnet transaction hashes for the commit, prove, and execute transactions. Using `zks_getL1BatchBlockRange`, we fetch the range of L2 blocks included in each batch.
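As a sketch of this backfill step (the endpoint URL and the response field names are our assumptions about the node's reply; only the two `zks_` method names are taken from the text above), the requests can be assembled and sent as follows:

```python
import json
import urllib.request

# Assumed public zkSync Era JSON-RPC endpoint; any Era node would work.
ERA_RPC = "https://mainnet.era.zksync.io"

def rpc_payload(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body for the zkSync `zks_` namespace."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}

def call(method, params):
    """POST a JSON-RPC request and return its `result` field."""
    body = json.dumps(rpc_payload(method, params)).encode()
    req = urllib.request.Request(
        ERA_RPC, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]

# Illustrative usage (network calls commented out; field names such as
# "commitTxHash" are assumptions about the response shape):
# details = call("zks_getL1BatchDetails", [350000])
# commit_tx = details["commitTxHash"]
# first_block, last_block = call("zks_getL1BatchBlockRange", [350000])
```

Iterating this over a batch-number range yields the L1 transaction hashes and L2 block ranges used throughout the backfill.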
At batch 330,981, zkSync Era underwent a prover upgrade, redeploying L1 verification contracts and moving from a Plonk-based proof system to their updated Boojum system. zkSync Era data is broken into two datasets: Plonky data for the batches before the transition and Boojum data for batches after the upgrade.
To trace proof batches for Starknet, we trace the `verifyProofAndRegister` transactions on Ethereum. From these traces, we can examine the `isValid(fact)` calls and generate a list of facts for each proof. Once the list is generated, we can directly match the component transactions to the batch they belonged to. For data availability (DA) transactions, verification facts are available through logs. In the subsequent step, we utilize state-diffs to detect which facts are logged by each transaction. The final step is to match state transition facts with `updateState` events, which provide the L1 messaging costs and a list of L2 blocks for each batch.
The earliest Starknet batches utilize a different data format. For simplicity, our Starknet data omits the first ~300 batches. Similarly, the first few months of zkSync Era used a separate verification architecture; for simplicity’s sake, we omitted the first 100,000 batches. For Starknet, data between batches 368 and 3,966 is backfilled, and for zkSync Era, data is present between batches 100,000 and 401,345. The table below lists the date ranges of the backfilled proof batches, as well as the L2 block numbers that are covered within this dataset.
In this section, we detail the on-chain steps required for verifying zkSync and Starknet proofs. We dissect the associated costs, which include the on-chain submission of the proof, the major operations in the proof verification algorithm, and committing the rollup’s data for data availability. For reference, links to the on-chain contracts are included.
zkSync Lite and the early version of zkSync Era utilized a PLONK prover system. We expect these SNARK-based systems to have constant proof verification costs and variable data availability costs. In December 2023, Era upgraded to its Boojum proof system, which uses a STARK proof wrapped inside a SNARK. This proof is still expected to have a fixed verification cost and a variable data availability cost, but more efficient proving that reduces centralized infrastructure costs. Unfortunately, no data on these centralized costs is available to analyze.
To first establish a baseline, Figure 1.1 shows a breakdown of the daily L1 verification costs accrued by zkSync Lite. For the zkSync Lite PLONK prover, blocks are first committed, then proved, and finally executed. Most interestingly, `commitBlocks` uses the most gas, totaling 60% of the verification cost, while `proveBlocks` and `executeBlocks` each account for approximately 20%. The quality of L2 transaction data and analytics for zkSync Lite is not optimal. However, estimates based on block averages suggest that zkSync Lite transaction volumes rarely surpass 100K Layer 2 UOPS per day. For a deeper dive into zkSync Lite data, refer to this dashboard by Marcov.
⚠️ Changes from zkSync Lite to Era
1 — L1 ↔ L2 messaging operations were moved from `executeBlocks` to `commitBlocks`, reducing `executeBlocks` costs and increasing `commitBlocks` costs. (This allowed Era to create a ~20-hour execution delay, during which blocks are committed and proved, then remain in a pending state before being finalized. This delay creates a roll-back buffer in case of a security incident. This delay was used in December after record volumes from BRC-20 token inscriptions stress-tested the network.)
2 — zkSync Era moved to storing state-diffs instead of full transactions. Given the messaging changes, we would expect the proportion of `commitBlocks` to increase; however, it remains constant, suggesting that state-diffs require fewer DA bytes than full transactions.
The EOAs/Proxies responsible for submitting the transactions for both zkSync systems are as follows:
A zkSync batch consists of three separate operations: committing transactions/state-diffs, verifying proofs, and execution, which includes updating the chain head and messaging, among others. The L1 transactions for each step can be found in the zkSync Era Block Explorer and will be used to break the verification costs into separate operations. The Ethereum mainnet contracts processing these transactions are listed below.
All zkSync proof systems start by sending DA commitments for batches with a `commitBatches` transaction. In zkSync Lite, a block is represented by its transactions, whereas in zkSync Era, it is represented by the state difference that results from executing the transactions. `commitBatches` may contain multiple batches and consume/emit L1 ↔ L2 messages. Once DA commitments have been hashed and saved, a proof-bearing `proveBatches` transaction is sent. Each batch has one proof. The verification of the proof checks that a DA commitment is saved and that the zkProof is correct. Several hours later, batches are ‘executed’, finalizing the state of the L2. Each `executeBatches` transaction executes many batches at once.
An analysis of the cost breakdown between the Era Plonk and Boojum provers in Figure 1.3 reveals that, although the percentage plots do not indicate drastic changes, the total L1 gas consumption suggests that the Boojum upgrade was a significant improvement.
The verification cost per batch for Plonk verification was ~760,000 L1 gas, while verifying Boojum proofs requires only ~460,000 gas per batch, a considerable improvement. Counterintuitively, the proportion of costs spent on `proveBatches` does not show a significant reduction. However, a review of the total daily expenses in the top subplots reveals a clear decrease in total fees. Considering the increase in Layer 2 UOP volumes following the Boojum upgrade, it appears that zkSync has also made substantial optimizations to their DA posting during this upgrade.
Further analysis of how zkSync reduced their DA expenses while handling more transactions would be valuable.
Starknet utilizes a STARK-based proving system, which we expect to have a polylogarithmic verification cost.
All proof verification transactions originate from one of two EOAs. These EOAs call one of five entry point contracts for each step of the batch verification.
The table below showcases the five entry point contracts, along with an example transaction showing the operations executed. These contracts handle the various steps of proof verification. Please note that batches consist of hundreds of transactions sent over dozens of blocks.
Starknet blocks are batched together. The `registerContinuousMemory` transaction submits the state-diffs of the blocks as data availability (DA), involving two transactions per block: one capturing the state-diff, and the other capturing the block outputs. Once a batch of state-diffs has been committed, the corresponding execution proof is verified. Finally, L1 ↔ L2 messaging is processed, and the head of the chain is advanced.
To break down the verification, we will separate it into the following:
Data availability through `registerContinuousMemory` uses the most gas, accounting for approximately 80% of the verification cost. `updateState` accounts for 10%, `verifyProofAndRegister` for 5%, and the remaining operations use the final 5%.
zkSync Era and Starknet use approximately similar amounts of daily gas, ranging between 2 and 4 billion. While Starknet uses significantly less gas for proof verification than Era’s Plonk and Boojum provers, it consumes more gas to make its data available. Although this may not be a fair comparison due to different network usage, these insights are valuable. For a more precise comparison, in section 2, we detail the verification cost per user operation.
The above data suggests that Starknet could benefit significantly from minimal improvements to its on-chain data representation or the implementation of EIP-4844, which might lead to it becoming cheaper than zkSync Era.
The previous analysis compared the total daily verification costs between networks and the breakdown of their internal expenditures. While this provides insight into the operating costs of the rollups, it does not help us understand the performance or cost of the network for end users. In this section, we compare the verification costs of the system per user operation, as detailed in the methodology section.
To achieve this, we will measure the number of user operations in each batch of L2 blocks and plot them against the verification costs of the batch on L1 for both Starknet and zkSync Era.
Once we have a formula for how the cost per transaction changes, we can plot the performance of both networks based on batch size and compare their average costs per user operation.
We define the `userOperationCost` for a batch according to the number of L2 operations and the amount of L1 verification gas spent per batch.
In the following sections, we’ll plot L1 gas used against the number of user operations in the batch and aim to find a best-fit regression for the above formula. We then compare these between Starknet and zkSync.
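A minimal sketch of this model (the `fixed` and `marginal` coefficients here are placeholders for the regression outputs, not the actual fitted values):

```python
def batch_verification_gas(user_ops, fixed, marginal):
    """Linear model: L1 gas to verify a batch of `user_ops` L2 operations."""
    return fixed + marginal * user_ops

def user_operation_cost(user_ops, fixed, marginal):
    """L1 verification gas amortized over each L2 user operation."""
    return batch_verification_gas(user_ops, fixed, marginal) / user_ops

# The fixed cost is amortized away as batches grow (placeholder coefficients):
# user_operation_cost(1_000, fixed=50e6, marginal=2_500)  -> 52,500 gas/op
# user_operation_cost(50_000, fixed=50e6, marginal=2_500) ->  3,500 gas/op
```

The regression, then, is a best-fit line for `batch_verification_gas`, and the per-operation cost follows by dividing through by the batch size.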
In Figure 2.1, we plot the L2 batch user operation count against the gas required to verify the batch on L1, covering both zkSync Era (Plonky) and zkSync Era (Boojum).
We note a few things:
`commitBlocks` appears to be constant with the number of transactions in Era pre-Boojum. This is not what we expect, as we assumed that larger transaction counts would result in larger state-diffs. However, this discrepancy can be explained by Era’s batching algorithm. The batcher tries to include as many transactions as possible, up to a gas limit or a hard limit of 750 transactions (Plonk) and 1,000 transactions (Boojum). Based on this assumption, batches with lower transaction counts likely used proportionally more storage. To address the selection bias in the batch size and batch cost, we will remove the cost of providing DA to ensure a fair comparison of the networks.

💡The Negative Cost per Transaction regressions result from zkSync batch sizing and seal criteria. Batches are sealed and proved if any of the below conditions are met:
— MAX_PUBDATA of 120kb per batch is reached
— Transaction limit of 750 (Plonk) or 1000 (Boojum) is reached
— Gas limit of 80 million is reached
— zkEVM Prover circuit limit reached
More documentation about batch sizes in zkSync can be found here.
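A toy version of these seal criteria (constants taken from the list above; the real sealer also tracks per-circuit utilization, which we reduce to a single flag here):

```python
MAX_PUBDATA_BYTES = 120_000   # 120 kb of pubdata per batch
MAX_TXS = 1_000               # 750 under the Plonk prover, 1,000 under Boojum
MAX_BATCH_GAS = 80_000_000    # 80 million gas limit

def should_seal(pubdata_bytes, tx_count, gas_used, circuit_limit_hit=False):
    """Seal (and prove) the batch as soon as any single limit is reached."""
    return (pubdata_bytes >= MAX_PUBDATA_BYTES
            or tx_count >= MAX_TXS
            or gas_used >= MAX_BATCH_GAS
            or circuit_limit_hit)
```

Because a batch seals on whichever limit binds first, small-count batches tend to be the ones that hit the pubdata or circuit limits early, which is exactly the selection bias discussed above.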
Notably, we have the following regressions for L1 gas for an L2 batch of `x` user operations:
Extrapolating from the trends in the Era data, the cost of verifying a batch decreases as the number of user operations increases. This result is surprising, counterintuitive, and likely a result of selection bias in the batch size. As `x` goes to infinity, the model predicts -299.18 gas per transaction. This is impossible, so we must conclude that our model is missing data that would otherwise constrain the cost per transaction to at least 0 gas. The situation is even more absurd for Boojum.
On Starknet, the Merkle, FRI, and Proof verification are almost constant in relation to batch size. Register Memory and Update State operations both require calldata posting, and consequently grow linearly with the number of operations in the batch.
Notably, we have the following regression for L1 gas for an L2 batch of `x` user operations:
On Starknet, the total cost of verifying the batch increases as the number of UOPS increases. Based on the current dataset, the average L1 cost per user operation is 2,425 gas, with a fixed cost of ~51 million gas per batch. At the median batch size of 30.7 thousand operations, evaluating the above regression results in a cost per operation of ~4,120 L1 gas per L2 user operation.
For context, Ethereum spends an average of 108 billion gas per day and records an average of one million daily transactions. This translates to an average of 108,000 gas per transaction. Therefore, Starknet results in a 44x decrease in transaction cost on average.
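The arithmetic behind that 44x figure, using the marginal (per-operation) regression cost rather than the median-batch figure:

```python
# Ethereum baseline from the averages quoted above.
eth_daily_gas = 108e9        # 108 billion gas per day
eth_daily_txs = 1e6          # ~1 million transactions per day
eth_gas_per_tx = eth_daily_gas / eth_daily_txs   # 108,000 gas per transaction

# Starknet's marginal L1 cost per user operation from the regression.
starknet_marginal_gas = 2_425

improvement = eth_gas_per_tx / starknet_marginal_gas   # ~44.5x
```

Note that dividing by the median-batch figure of ~4,120 gas instead would give roughly a 26x improvement; the 44x quoted above reflects the marginal cost.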
The first thing to note about Starknet is that it operates with significantly larger batch sizes than zkSync Era. While zkSync Era has transaction limits of 750 or 1,000 per batch, Starknet has no transaction limit; instead, it fills proofs to resource limits. Starknet batches can vary widely, ranging from primarily simple operations, containing more than 50,000 operations, to those with resource-intensive operations having fewer than 15,000 operations.
🗒️ From the above observation, Starknet shows an advantage in compressing huge quantities of simple operations. While zkSync Era batches would fill at 1000 transactions, Starknet batches would continue to be filled until resource limits were met.
Taking into account the batch sizes of each network, zkSync Era and Starknet spend a very similar amount of gas verifying a similar number of L2 operations. Figure 2.3 below plots the daily L1 cost per L2 operation, which is computed by dividing the L1 verification gas by the number of L2 operations.
From the above figure, it’s clear that the Plonky prover for zkSync Era showed a cost-per-operation very similar to that of Starknet. However, the recent Boojum upgrade resulted in substantial improvement, decreasing the cost per operation by about half, thus giving zkSync Era the current advantage in user operation compression.
Since EIP-4844 is expected to impact the dynamics of the rollups, it is valuable to analyze the costs of proof verification without the DA. In the case of zkSync Era, the `proveBatches` costs remain fixed. On the other hand, for Starknet, the `verifyMerkle` operations have a fixed cost per batch, while `verifyFRI` and `verifyProofAndRegister` operations have a polylogarithmic cost.
💡While `verifyFRI` has polylogarithmic verification costs, the current prover only fills proofs up to 128 leaves, requiring eight `verifyFRI` operations. If 256 leaves were filled, nine `verifyFRI` operations would be required. The two tiers of Starknet costs are from earlier proofs, which filled only 64 leaves, requiring only seven `verifyFRI` operations.
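This leaf-count behaviour is the polylogarithmic term in action: each doubling of the leaf count adds one FRI layer. A sketch of the relationship, derived from the three counts quoted above (64 → 7, 128 → 8, 256 → 9):

```python
import math

def verify_fri_ops(leaves):
    """Number of verifyFRI transactions for a proof filled to `leaves` leaves.

    One operation per FRI folding layer, plus one — an inference from the
    observed tiers, not a statement of the prover's internals.
    """
    return int(math.log2(leaves)) + 1
```

So quadrupling the proof size from 64 to 256 leaves adds only two `verifyFRI` transactions, which is why verification cost grows so slowly with batch size.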
The bottom subplot of Figure 2.4 plots the fixed costs per operation on a log scale and evaluates the operation compression at the minimum, median, and maximum batch sizes. For zkSync Era, the median post-Boojum batch contains 807 operations, resulting in a fixed cost of 568.15 gas per UOP. In contrast, the median batch size on Starknet is significantly higher, at 30,700 user operations, leading to verification costs of 432.35 gas per user operation.
From this graph, it appears that zkSync Era has a compression advantage, as the cost per UOP is shifted approximately two orders of magnitude to the left compared to Starknet. However, in practice, zkSync Era batches are much smaller, and when considering the effective costs per UOP for each batch size, Starknet currently maintains the advantage in compression if DA and messaging costs are disregarded.
The concrete execution cost on a validity rollup is difficult to compare and contrast. For instance, what does it mean to rewrite something in Cairo? How expensive would it be to run on-chain compared to the Solidity implementation? Will both EVMs consume the same amount of gas for a set of opcodes? These questions evoke the age-old computer science debate: Which programming language is the fastest?
Although this question lacks a definitive answer, there are typically two approaches to answering it. Firstly, one could benchmark a variety of algorithms and their implementations across different languages and runtimes, and compile the data to compare the performance of a class of algorithms as exemplified by the Benchmarks Game. Alternatively, the second approach involves implementing the program in question across various environments and directly measuring the performance of these implementations.
In this section, we’re not focusing on which network is the “fastest”. Instead, our aim is to compare programs based on the unit of compute as tracked by networks — gas.
The following analysis revolves around the observation that despite each network featuring novel programs, differing implementations of the same programs, and different usage levels, the overall pattern of the profile of user operations remains fairly consistent.
This allows us to fit their usage profiles together and compute functions that, on average, convert the gas units on one network to the gas units on another. We will attempt to approximate a formula that compares the gas consumption on each network with Ethereum’s gas metrics.
Our result is unlikely to be a good formula, especially as the networks diverge in user behavior. However, we believe it offers a novel way to compare the networks’ computing capabilities, providing developers with a “finger in the wind” technique to gauge potential locations of their next deployment. We hope the technique inspires others to construct better, more nuanced methods.
To compare the makeup of transactions between each network, we backfilled a week of transactions for Ethereum, zkSync Era, and Starknet. All data collected falls between September 16th and September 23rd, with Unix timestamps between `1694847600` and `1695452400`. The block ranges for each network are listed below.
The table below showcases the data on resource consumption of transactions over this period. `gas_used` is analyzed across all three networks. It is important to remember that gas is constructed fundamentally differently between networks and cannot be directly compared.
To create a more accurate baseline for transaction distribution, we excluded native mainnet ETH transfers (accounting for 32.26% of transactions), as Starknet and zkSync Era have no native ETH transfers, and such transfers are not part of smart contract traffic on Ethereum. Most applications exclusively use ERC20 tokens, and WETH transfers are still captured in this data.
The table below shows the unfiltered Ethereum gas usage data. However, in the subsequent sections, we will exclude native transfers from all Ethereum distributions.
💡The Starknet distribution is much “spikier” because gas usage remains consistent across all executions of the same operations. In zkSync Era, gas usage for identical transactions varies depending on L1 gas prices, leading to a smoother distribution. For Ethereum, the smoothing is likely due to the wide variety of protocols and different implementations of each UOP.
The distribution across each network is most concentrated toward the lower end, which aligns with the expectation that the majority of transactions involve basic transfers, ERC operations, and DeFi transactions. This is corroborated by analyzing the transaction makeup on each network.
The makeup of transaction types across networks is very similar, predominantly involving transfers and swaps on DEXes. Although these rollups facilitate cost-effective on-chain deployment for a new class of applications, the bulk of the observed activity results from the more affordable deployment of applications already existing on Ethereum.
These graphs do not indicate a trend of applications using significantly more compute than before.
We assume that the distributions originate from the same “global” distribution of user intents. If this assumption holds, we would expect the Kolmogorov–Smirnov (KS) test to be zero. However, the resources of the networks are measured in different units. We use the expectation that KS should be 0 to normalize the units by defining a function over the resource usage histogram and minimizing the output of the KS test. Effectively, we treat the KS test as an objective function to be minimized. Our methodology is detailed comprehensively in Appendix B.
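A minimal sketch of the procedure on synthetic data (assuming SciPy; the lognormal shapes and the true scale factor of 7.5 are invented for illustration, and we fit a single multiplicative coefficient rather than the full conversion function of Appendix B):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
# Both samples stand in for the same "global" distribution of user intents,
# but the second chain meters them in different gas units (here, 7.5x).
eth_gas = rng.lognormal(mean=11.0, sigma=1.0, size=5_000)
l2_gas = 7.5 * rng.lognormal(mean=11.0, sigma=1.0, size=5_000)

def ks_distance(scale):
    """KS statistic between Ethereum gas and L2 gas rescaled by `scale`."""
    return ks_2samp(eth_gas, l2_gas / scale).statistic

# Treat the KS statistic as the objective and minimize over the scale factor;
# the minimizer approximates the unit-conversion coefficient (~7.5 here).
fit = minimize_scalar(ks_distance, bounds=(1.0, 50.0), method="bounded")
```

With real data, the conversion function has more than one coefficient, but the principle is the same: the conversion that makes the two histograms most alike, as judged by the KS statistic, is taken as the unit mapping.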
The result is an approximate mapping from zkSync gas to Ethereum gas and from Starknet gas to Ethereum gas. This allows us to compare transaction costs between the networks and lays the groundwork for tackling comparisons of compute/second.
We obtain coefficient outputs for the conversion functions after minimizing the variance between the two histograms. Plotting the converted densities next to the Ethereum KDE not only reveals the similarities between distributions, but also highlights potential areas where the fit might be improved. This is a thoroughly interesting exercise in data science, and we encourage everyone to explore the repository for this research and contribute to improving our method.
Now that we have classified and understood the Layer 1 verification costs, the next step in computing profitability is to examine rollup revenue. Similar to pre-EIP-1559 Ethereum, rollups pay their operating costs with transaction fees, which are typically denominated in ETH.
Gas on Ethereum serves as the primary mechanism to thwart DoS attacks. It does so by assigning the cost of opcodes in proportion to the load they put on the network. Basic operations like `ADD` are cheap, consuming only 3 gas, whereas expensive operations, such as storage writes, can consume 20,000 gas. A well-designed gas schedule also reduces the variation in maximum execution times. This is because each block has a gas limit, and the consumption is closely correlated to execution complexity.
This principle also applies to Layer 2 networks, where provers and sequencers have constrained resources. The fees paid for a transaction must be proportional to the complexity of executing transactions, sending data-availability commitments, generating proofs, verifying proofs, and the future price of L1 gas. This results in a complex equation ensuring the fees paid by rollup users cover the infrastructure costs while keeping fees low to improve UX.
Optimistic rollups typically feature EVM implementations that closely mirror those of the Ethereum mainnet, resulting in nearly identical gas mechanics. This includes copying Opcode prices exactly, as seen with Optimism and Base. On the other hand, ZK-rollups must adjust additional operations since all instructions must be run through a zkProver. In particular, computing commonly used hash algorithms in a ZK environment is computationally intensive, and opcode prices are adjusted accordingly.
These changes have intriguing design implications. One notable example is mappings, which retrieve values from keys using hashes. ERC tokens extensively utilize mappings; however, the increased cost of hashing renders standard ERC token implementations less efficient on ZK-rollups compared to those on optimistic rollups. Starknet also uses ~252-bit words (felts) instead of the 256-bit words found in EVM-compatible chains, further reducing the efficiency of standard implementations. Consequently, optimized implementations of standards and protocols may vary significantly between ZK-rollups and Optimistic rollups/Ethereum mainnet.
Starknet uses a novel Virtual Machine that is not EVM compatible. However, the concept of gas remains identical, with fees assigned proportionally to the resources consumed. Each Starknet transaction utilizes different execution resources, such as “steps”, “builtins”, and “memory”. There is a weighting formula over the execution resources, ensuring that the L2 gas models both the proof generation complexity and the L1 verification cost. Finally, the Sequencer is fed a weighted average of L1 gas prices. For more detailed information on this mechanism, refer to the Starknet Book, and AVNU’s blog post: A Tale of Gas and Approximation.
As seen in Figure 4.1, Starknet prices are highly correlated to Ethereum gas prices. This close tracking of the gas prices makes Starknet transactions predictable and understandable for end users. A fee market is on the roadmap for Starknet, which will cause these graphs to diverge. While the fee market will allow users to specify their priority fees, it is likely that L2 gas prices will always have a lower bound determined by a function of the L1 gas price.
On zkSync Era, block base fees remain very steady, set at 0.25 gwei for the majority of the lifetime of the Plonk prover, and adjusted every few days with the Boojum prover (currently around 0.10 gwei). To offset the fluctuating prices of L1 gas, opcodes that directly consume L1 gas are given a dynamic gas cost. For example, the gas used by an `SSTORE` opcode on zkSync Era changes each block to reflect the ETH fee for an `SSTORE` on Ethereum.
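A simplified model of this repricing (illustrative only: Era's actual formula involves pubdata pricing and batch overheads, not a direct ratio, and the constants here are placeholders):

```python
L1_SSTORE_GAS = 20_000  # approximate L1 gas for a cold storage write

def dynamic_l2_gas(l1_opcode_gas, l1_gas_price_wei, l2_base_fee_wei):
    """L2 gas to charge so the L2 fee covers the opcode's L1 cost.

    Fee parity: l2_gas * l2_base_fee == l1_opcode_gas * l1_gas_price.
    """
    return l1_opcode_gas * l1_gas_price_wei // l2_base_fee_wei

# With L1 at 30 gwei and an L2 base fee of 0.25 gwei, a 20k-gas L1 write
# reprices to 2.4 million L2 gas — which is why Era opcode costs look so
# large next to Ethereum's while the ETH-denominated fee stays comparable.
```

Since the L2 base fee is fixed for long stretches, spikes in L1 gas prices show up entirely in the L2 gas *amount*, as seen in the `approve()` spike discussed below.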
Arithmetic and logical operations impact only proof generation and do not linearly increase verification costs. Due to this compression, these opcodes consume a fixed quantity of gas.
Figure 6.2 showcases the dynamic resource pricing. It shows daily median gas consumption for zkSync Era ERC20 `approve()` transactions on the left axis and plots the L1 gas price on the right axis. On the Ethereum mainnet, `approve()` transactions consume a fixed ~46,000 gas, whereas on zkSync Era, each `approve()` transaction consumes several hundred thousand gas. Notably, around May 5th, 2023, the gas consumed for an ERC20 `approve()` on zkSync Era exceeded 3 million.
While it may seem counterintuitive to users familiar with the Ethereum gas structure, this mechanism allows zkSync Era to accurately model its L1 verification costs. Given that certain operations have a predictable L1 gas cost, the Era blockchain can set opcode prices for each block according to L1 prices. For more information on zkSync fee mechanics, read their docs.
The fees paid by users represent the rollup’s gross revenue. In Figure 6.3, we plot the percentage breakdown of where those fees are spent (excluding centralized off-chain costs). Starknet’s costs occasionally exceed 100% of fees collected, which is not an error but rather indicates an unprofitable day for Starknet.
In comparing zkSync Era and Starknet, it becomes evident that zkSync generates proportionally higher profits from the net fees collected. However, their expenses are also distributed quite differently. On Starknet, approximately 10% of fees are spent on proof costs, while zkSync expends around 25% of its fees on proof costs.
💡 The above data shows that decreases in data availability (DA) costs will disproportionately impact Starknet and zkSync Era. Since the percentage of costs spent on DA is much higher on Starknet than on zkSync Era, we anticipate that fee reductions from EIP-4844 will have a greater impact on Starknet than on zkSync Era.
While zkSync Era currently boasts a better overall compression ratio, the above data and the fixed costs depicted in Figure 2.4 lead us to believe that this may change after EIP-4844 goes live.
Now that we have a better understanding of how fees are generated on rollups, we plan to backfill the fee data for both Starknet and zkSync Era. We can estimate the rollup profit (which does not account for off-chain costs) by taking the L2 fees generated and subtracting the L1 verification costs for each batch.
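A sketch of that estimate per batch (field layout is hypothetical; off-chain proving costs are deliberately excluded, as noted):

```python
def batch_profit_wei(l2_fees_wei, l1_verification_gas, l1_gas_price_wei):
    """Rollup profit for one batch: L2 fees minus L1 verification spend.

    Excludes off-chain costs (proving hardware, sequencer infra), so this
    is an upper bound on operating profit.
    """
    return l2_fees_wei - l1_verification_gas * l1_gas_price_wei

def daily_profit_eth(batches, l1_gas_price_wei):
    """Sum batch profits over a day and convert wei to ETH.

    `batches` is an iterable of (l2_fees_wei, l1_verification_gas) pairs.
    """
    total = sum(batch_profit_wei(fees, gas, l1_gas_price_wei)
                for fees, gas in batches)
    return total / 10**18
```

In practice each batch's verification transactions land at different L1 gas prices, so the backfill prices each batch at the gas price its transactions actually paid rather than a single daily figure.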
The profitability of the rollups remains an open question. In the upcoming section, we’ll proceed under the assumption that they are indeed profitable. In addition, we call on rollup operators to publish data about their operational costs. This information is crucial for their communities, since the longevity of the networks, the degree to which they can decentralize, and their design priorities are partly dependent on operational profitability. Note that the cost of computing proofs varies greatly depending on the proof system chosen, and including this data may radically alter the observations below. With this in mind, we urge caution when referring to this data in other contexts.
The regression lines applied to the profit data illustrate the trends in both daily and cumulative profitability, with the coefficients for these regressions detailed in the legends. Starknet’s average daily profitability is nearly constant, in contrast to zkSync Era’s daily profitability, which shows a downward trend. We can create an estimate of daily income from the Cumulative Profit regression coefficients: for Starknet, it stands at 11.94 ETH per day, and for zkSync Era, at 36.79 ETH per day.
In the bottom right subplot, we plot the profit per transaction. A simple observation shows a significantly higher variance in Starknet’s profit, whereas zkSync Era maintains a steady profit, averaging around 50,000 Gwei per L2 transaction. We presume that zkSync Era’s proof times and L2 fee structure allow for better modeling of L1 costs, resulting in a more even distribution of costs for users. However, we note that the fee design for both networks is evolving, and criticizing either network at this point may be premature.
ZK-rollups rely on proof systems instead of M-of-N consensus for integrity, typically requiring only 1-of-N honesty to benefit from the validity proof. At the time of writing, many designs are being discussed and developed to ensure acceptable guarantees around the rollups’ other properties, such as liveness and censorship resistance. Regardless of the mechanism used, it remains unclear how many validators are ‘enough’ for a rollup (or any network) to guarantee all of these properties. Nevertheless, we can attempt to estimate an upper bound on the number of validators that can be supported given existing rollup profitability.
⚠️ This analysis should be approached with caution. The profitability of rollups is under question and the airdrop hunting season (at the time of writing) may bolster transaction volumes. Given that the networks are still in their early stages and experiencing significant growth, it is not clear whether the current profit margins will hold, grow, or shrink in the future.
First, we take the daily profits of Starknet and zkSync Era, which stand at 11.94 ETH and 36.79 ETH respectively. We then compute the annual rewards for each, assuming validators are rewarded proportionately.
Doing some rough calculations, assuming an ETH price of $2000 USD and 100 validators, each Starknet validator would gross approximately $7,200 per month, while zkSync Era validators would gross approximately $22,000 per month. This is enough profit to rent 2–3TB of RAM and 100+ CPU cores through EC2. Therefore, a network size of at least 100 validators is likely sustainable for both Starknet and zkSync Era.
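The arithmetic behind these figures can be sketched as follows; the ETH price, validator count, and a 30-day month are the stated assumptions:

```python
# Back-of-the-envelope validator revenue using the daily profit
# estimates above. ETH price and validator count are assumptions.
ETH_PRICE_USD = 2000
N_VALIDATORS = 100
DAYS_PER_MONTH = 30

def monthly_usd_per_validator(daily_profit_eth: float) -> float:
    """Gross monthly USD revenue per validator, split evenly."""
    return daily_profit_eth * DAYS_PER_MONTH * ETH_PRICE_USD / N_VALIDATORS

starknet = monthly_usd_per_validator(11.94)    # ~ $7,200 per month
zksync_era = monthly_usd_per_validator(36.79)  # ~ $22,000 per month
```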
Let’s start with the (admittedly generous) assumption that Ethereum’s per-validator revenue is a good indicator of a validator’s price. Ethereum validators are expected to gross ~1.4 ETH each year, excluding MEV profits. The table below attempts to quantify the number of validators each network could support if it allocated the same revenue to each validator as Ethereum does. It’s important to note that many networks are considering operating with on the order of hundreds of validators, which would significantly increase profitability per validator.
Note: The data for Optimistic rollups was pulled from Dune analytics; however, we have not backfilled or double-checked the Dune metrics.
While these numbers don’t account for Layer 1 MEV revenue or the higher hardware costs for operating rollup validators, they clearly show that rollups are generating (at the time of writing) enough profit to attract a significant number of operators, aligning with the current ambitions of the rollups. This calculus may change as the number of validators, users and transactions changes. However, it’s reasonable to expect that having hundreds of validators could be both possible and suitable under the current circumstances. We emphasize again the immense value that would come from current rollup operators publishing their server and prover costs. Such transparency would help validators understand the operational requirements and increase the pace of decentralization.
After reviewing the verification cost breakdowns of zkSync Era and Starknet, it becomes clear that data availability is currently the primary cost for validity rollups. Each non-zero byte of calldata costs 16 gas (with zero bytes costing 4 gas), and rollups post volumes ranging from several dozen to hundreds of megabytes daily. The total daily calldata consumption across major rollups frequently surpasses 600 MB, occasionally peaking at over 1 GB. This extensive data posting contributes to L1 bloat and increases the cost of L2 operations. In the following section, we will analyze the effect of EIP-4844 on rollup costs.
Each blob consumes a fixed amount of `blob_gas`, with each block having a target of 3 blobs (0.375 MB) and a maximum of 6 blobs (0.75 MB). Blob fees are calculated using the formula `blob_fee_wei = blob_gas * blob_gas_price`.
For each block, the following fee mechanism is in place: should multiple blocks in a row consume more than the target blob count, the blob gas price will rise exponentially. The `UPDATE_FRACTION` parameter is set such that the maximum increase in `blob_gas_price` between two blocks is 12.5%. The `excess_blob_gas` accumulator tracks how far cumulative blob gas consumption has run above the target:
excess_blob_gas = parent.excess_blob_gas + parent.blob_gas_used - TARGET_BLOB_GAS
blob_gas_price = 1 * e**(excess_blob_gas / UPDATE_FRACTION)
+-----------------+-----------------------------------------+
| Number of blobs | Change in blob_gas_price for next block |
+-----------------+-----------------------------------------+
| 0 | -11.1% |
| 1 | -7.55% |
| 2 | -3.85% |
| 3 | No change |
| 4 | +4.00% |
| 5 | +8.17% |
| 6 | +12.5% |
+-----------------+-----------------------------------------+
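The percentages above follow directly from the update rule. A minimal sketch using the article’s own parameter names, with constant values taken from the EIP-4844 specification (`GAS_PER_BLOB = 131072`, and an update fraction chosen so that six blobs yield a 12.5% step):

```python
import math

# Constants per the EIP-4844 specification
GAS_PER_BLOB = 131072
TARGET_BLOB_GAS = 3 * GAS_PER_BLOB   # 3-blob target per block
UPDATE_FRACTION = 3338477            # tuned so 6 blobs => +12.5%

def price_change_pct(blobs_in_block: int) -> float:
    """Percent change in blob_gas_price after one block with the
    given blob count, relative to the previous block."""
    delta = blobs_in_block * GAS_PER_BLOB - TARGET_BLOB_GAS
    return (math.exp(delta / UPDATE_FRACTION) - 1) * 100

for n in range(7):
    print(f"{n} blobs: {price_change_pct(n):+.2f}%")
```

Running this reproduces the table: 0 blobs gives -11.11%, 3 blobs no change, and 6 blobs +12.50%.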
Rollups are currently incurring costs of about 0.2 to 0.6 ETH for every 128 KB of calldata. With a target of 3 blobs per block, Ethereum’s DA capacity will be approximately 2.7 GB per day, nearly double the peak DA load observed in January 2024. Since the blob supply is fixed, the price of blob gas is set by demand. In the initial weeks following the Dencun hard fork, blob gas prices will be negligible until the usage of EIP-4844 accelerates and the 2.7 GB supply is fully utilized. During this period, the cost of rollup DA will be several orders of magnitude lower than its current value.
The graph below shows the daily average fee paid for mainnet calldata for each of the rollups. Calldata posts are broken into 128 KB chunks (the size of a blob) to approximate an upper bound for blob gas prices. Given the current fee of approximately 0.075 ETH per 128 KB, this equates to a blob gas price of around 60 gwei. We hypothesize that if blob gas prices were to rapidly reach these levels, rollups might decide to continue posting calldata instead of blobs, particularly considering the permanent persistence of calldata as opposed to the two-week lifetime of blobs.
One unexpected observation from the data presented in Figure 5.2 is the variability in fees per kB. Even Base and Optimism, which are built on identical tech stacks, have different fees.
Two factors can contribute to this discrepancy: `0x00` bytes, and fluctuating gas prices. Non-zero bytes of calldata use 16 gas, whereas zero bytes use only 4 gas. Since both zero and non-zero bytes are accounted for, if network A sends very sparse DA commitments with many zeroes while network B sends very dense DA commitments with few zeroes, network A will have a lower gas consumption per kB compared to network B.
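The 4x spread between fully sparse and fully dense payloads can be seen directly. A minimal sketch:

```python
# Calldata gas under the pre-4844 pricing rules:
# 16 gas per non-zero byte, 4 gas per zero byte.
def calldata_gas(data: bytes) -> int:
    return sum(16 if b != 0 else 4 for b in data)

kb = 1024
sparse = bytes(kb)            # 1 kB of zero bytes
dense = bytes([0xFF]) * kb    # 1 kB of non-zero bytes

sparse_gas = calldata_gas(sparse)  # 4,096 gas per kB
dense_gas = calldata_gas(dense)    # 16,384 gas per kB
```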
The other factor that could influence this scenario is demand pricing. If network A had a policy of sending commitments only after the previous two blocks used less gas than the target, it would likely end up avoiding peak traffic times and paying lower fees on average. This strategy would decrease the consistency of DA commitments, and we doubt that it is the root cause of this variability.
While the difference in gas costs between dense and sparse datasets is likely the primary factor behind the variability, delving deeper into this analysis could be interesting.
Examining verification costs provides valuable insights into the operations of rollups. While compression and scaling are frequently addressed topics, assessing the realized costs and throughput offers tangible insight into the efficiency of Starknet and zkSync Era. This exercise should be performed for all rollups.
Our analysis was effective due to the similar profiles of user operation usage across both networks. As long as this condition persists, UOPS/second serves as a reliable metric for comparing networks. However, this raises the question: will rollups execute fundamentally different types of computation in the future? If rollups begin to specialize, UOPS/second will gradually lose accuracy, and comparing execution resources per second would provide a more accurate metric.
Additionally, we have taken the initial steps toward constructing fair comparisons by creating a function to convert between execution resources. We hope our efforts encourage more investigation into fair comparison using similar techniques. We believe that, once a baseline metric for comparing execution resources across domains is established, it could replace transactions-per-second (TPS) as the standard metric.
We performed a high-level analysis of the current L1 expenses and L2 revenues, which proved to be a valuable exercise for identifying trends and patterns. However, it does not provide insights into the profitability of validity rollups. Once more, we emphasize the value of public data. Should rollups disclose their centralized infrastructure costs and architectural overviews, it would enable researchers, developers, and operators to gain a clearer understanding of the operational requirements for different rollups, thereby accelerating the pace of decentralization.
In its current state, EIP-4844 blobs will persist on the beacon chain for only two weeks. Although this may be adequate for most cases, we suspect that some rollups may aim to increase the lifetime of their data availability (DA) commitments by switching to alternative solutions. Moreover, the supply of EIP-4844 blobs is fixed, and as demand for blobs rises, their price will rise with it. In contrast, on dedicated DA networks, supply can expand in response to rising demand, giving dedicated DA solutions a price advantage in the long term. It remains to be seen whether EIP-4844 will become the standard for rollup DA or if dedicated networks will prevail over time.
Data for this research was collected from the JSON-RPC endpoints of the Layer 2s, with Layer 1 data being backfilled from a combination of JSON-RPC endpoints and the Etherscan API. Additionally, some Starknet data was backfilled using Voyager.
For the backfilling process from RPC endpoints and APIs, the python-eth-amm library was used.
The instructions for recreating the data used in this report, as well as the associated notebooks can be found in the L2Bits repository.
Transactions from the following addresses were backfilled to compute the L1 verification costs associated with each network. While this data was available through Dune analytics, it was manually backfilled to ABI decode transactions, ensuring full control over the underlying data.
To compare the makeup of transactions between each network, we backfilled a week of transactions for Ethereum, zkSync Era, and Starknet. All data collected spans from September 16th to September 23rd, 2023, with Unix timestamps between 1694847600 and 1695452400. The block ranges for each network are listed below.
Starknet proofs are clearly delineated, with batches containing several hundred transactions broken up into separate verification steps. To compute the Starknet batches, we selected all L1 transactions in order and indexed through them until we detected the start of a new proof batch. Additionally, we collected L2 blocks and matched the `L1VerificationHash` parameter to the L1 batches to precisely calculate which blocks were proved in each batch.
For zkSync Era and zkSync Lite, the Operator EOA regularly sends `executeBlocks`, `commitBlocks`, and `proveBlocks` transactions, without breaks in between. Due to this frequency, we avoided any logic-based batching and instead grouped L1 and L2 blocks into hourly bins. Although the batch delineation is not as clear as in the Starknet case, oversampling should not be a significant issue due to the regularity of the L1 proof verification.
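The hourly-binning step can be sketched as follows; the transaction records here are illustrative, not our actual schema:

```python
from collections import defaultdict
from datetime import datetime, timezone

def hourly_bins(txs):
    """Group L1 transactions by the UTC hour of their block timestamp,
    rather than reconstructing batch boundaries."""
    bins = defaultdict(list)
    for tx in txs:
        ts = datetime.fromtimestamp(tx["timestamp"], tz=timezone.utc)
        hour = ts.replace(minute=0, second=0, microsecond=0)
        bins[hour].append(tx)
    return bins

# Illustrative transactions: two in the 07:00 bin, one in the 08:00 bin
txs = [
    {"timestamp": 1694847600, "gas_used": 450_000},  # 07:00 UTC
    {"timestamp": 1694849400, "gas_used": 520_000},  # 07:30 UTC
    {"timestamp": 1694851500, "gas_used": 480_000},  # 08:05 UTC
]
bins = hourly_bins(txs)
```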
This research highlighted a lack of mature indexing and analytics solutions for new rollups. For example, data from Dune analytics and Subgraph is not available for Starknet. zkSync Era was recently added to Dune analytics in October 2023, which represents notable progress and has made many areas of this research significantly easier.
Collecting raw data from RPCs proved more challenging than expected: many public RPCs sit behind Cloudflare gateways, and we encountered rate limits, timeout errors, and non-responsive host errors. If you’re interested in analyzing these networks, consider exploring the CLI tool included in python-eth-amm.
Starknet RPCs fare much better, since three open-source node implementations are available, and both Juno and Pathfinder provide snapshots that allow for node setup in under an hour. Additionally, Starknet’s block explorers are more complete than those for zkSync Era and Lite, making sleuthing much easier.
Decoding data for Starknet is significantly more challenging, since the network is not EVM-based and its tooling is less mature. In this research, we were able to avoid most of these challenges by utilizing the indexing infrastructure of the Voyager block explorer. However, recreating Starknet parsing and decoding at home is currently quite difficult.
Extrapolating from here, the transaction data for each rollup will be a derivation of user intent. Some chains have lower fees and weaker security and might attract more activity from low-capital applications, while other chains may have computationally intensive user intents filtered out by economics. Working from these assumptions, it would be logical to conclude that the transaction makeups across chains are drawn from similar user intent distributions.
The first step is to exclude outliers between rollups. For Ethereum, we will exclude all native ETH transfers, since they constitute a significant portion of the volume but are not part of the smart contract ecosystem. Even with ETH transfers excluded, DeFi transfers are still captured, given that WETH is significantly more prevalent within DeFi.
From these filtered datasets, we compute the densities, which describe the proportion of the transactions at each resource consumption level. We then compute the Kernel Density Estimate (KDE) for each network. Using the KDE, we can estimate the number of transactions expected at a given price level and generate new random distributions following its estimates.
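The KDE step can be sketched with scipy; the samples below are synthetic (log-normal, a common shape for fee data), not the actual Starknet or zkSync Era distributions:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic per-transaction resource consumption (e.g. gas per tx)
rng = np.random.default_rng(42)
gas_per_tx = rng.lognormal(mean=11.0, sigma=0.6, size=5000)

# Fit a kernel density estimate over the observed consumption levels
kde = gaussian_kde(gas_per_tx)

# Estimate the density at a given resource level...
density = kde.evaluate([60_000])[0]
# ...and draw new random samples following the fitted estimate
resampled = kde.resample(1000)
```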
The next step involves finding a method to compare the similarity of the two distributions. The Kolmogorov–Smirnov (KS) test is an ideal choice in this context, as a 2-sample KS test produces a p-value that can be loosely interpreted as the probability that the two datasets were drawn from the same distribution. To learn the fundamentals of this test, read more here.
However, the KS test has one significant downside: it is sensitive, and if the distributions differ too much, the p-value collapses to zero, leaving the optimizer without a signal on whether the variance between distributions is increasing or decreasing.
Due to this sensitivity, we need to manually adjust the parameters of the function until the distributions begin to look somewhat similar, and the KS statistic is less than 0.80. Once we reach this point, the optimization algorithm can discern whether changes are increasing or decreasing the accuracy, allowing the optimizer to take over from there.
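The comparison step can be sketched with scipy’s two-sample KS test; the inputs below are synthetic stand-ins for the two networks’ per-transaction resource distributions:

```python
import numpy as np
from scipy.stats import ks_2samp

# Synthetic stand-ins for two networks' resource distributions,
# differing slightly in location
rng = np.random.default_rng(0)
network_a = rng.lognormal(mean=11.0, sigma=0.6, size=2000)
network_b = rng.lognormal(mean=11.1, sigma=0.6, size=2000)

# stat is the KS statistic (max distance between empirical CDFs);
# p_value is the test's significance level
stat, p_value = ks_2samp(network_a, network_b)
```

Once the conversion parameters bring the statistic below the 0.80 threshold described above, the optimizer has a usable signal and can take over the tuning.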