Formal Verification of Halo2 Circuits in Lean

Security

•

July 2, 2025

Synopsis

‍

We are delighted to announce that we have completed our work on creating an infrastructure in the Lean proof assistant for formal verification of zk circuits written in Halo2, a zk framework used by major players in the ecosystem such as ZCash, Protocol Labs, and Scroll. This work was generously supported by the Ethereum Foundation grant entitled “Lean Extraction of Circuit Constraints from Halo2”.

‍

In particular, we have created an infrastructure called Halva, which allows us to extract constraints from the Halo2 circuit generation process into the Lean proof assistant and therein reason about circuit correctness. We have applied our infrastructure to the real-world zk implementation of the Keccak-256 hash function used by Scroll, in which we have found a critical bug that would allow a circuit using the Keccak-256 circuit to arbitrarily claim the Keccak256 output. Luckily, this bug could not have been exploited given that Scroll’s proving process is centralised. Moreover, the circuit in question has in the meantime been deprecated and the current Scroll Keccak-256 zk implementation does not suffer from the same bug. Nonetheless, we firmly believe that this work and its findings further underline the paramount importance of formal verification for real-world zk circuits, especially given that we are in an ecosystem that is slowly but certainly moving towards decentralised proving.

‍

Read on for more details about our new Halo2 verification infrastructure and the Keccak-256 bug that we have discovered, check out our other completed and in-progress work, such as:

‍

our ongoing EF-funded grant on creating an infrastructure for formal verification of Plonky3-based zk circuits;
our ongoing EF-funded grant on working towards a mechanised proof of the batched soundness of the FRI protocol in Lean, which can then be used to verify soundness of various zk proof systems;
our work on verifying correctness of cutting-edge RISC-V zkVMs, such as the RISC-Zero's R0VM and Succinct's SP1;
our verification of the honesty of ZKSync’s on-chain verifier, which was a first-of-its-kind achievement;
our verification of the correctness of the INTMAX2 protocol;
‍

and do not hesitate to contact us if any of our work could be applied to your infrastructure!

‍

Halo2 Lean Verification Infrastructure

‍

Our infrastructure for extracting Halo2 constraints into the Lean proof assistant, dubbed Halva, is available here. What we do, in essence, is that we hook into the Halo2 circuit generation process and use the information present therein to produce a Lean model of the circuit constraints. In particular, we strip away all of the Rust-related logic in order to be able to verify precisely what gets generated, bringing together all of the gate-, copy-, permutation-, and lookup-related constraints into a single Lean file. For the moment, we do not extract any information about the witness generator, but may do so in the future once the Lean back-end of the HAX Rust-reasoning framework comes out.

‍

Importantly, the extraction process allows us to maintain various properties of the circuit to be symbolic rather than concrete—such as, for example, circuit length and public inputs—in turn enabling general reasoning about how all of the possible values of these parameter affect circuit behaviour. We stop short of extending this symbolic-ness to arbitrary configuration parameters, however, as that would require full reasoning about Rust programs in Lean.

‍

The structure of the repository, in more detail, is as follows:

‍

src/field.rs: defines TermField, which is our symbolic field that can be used to allow field parameters to show up in the formal model;
src/collecting_assignment.rs: contains the base class for a prover which stores all of the assignments made during circuit construction and synthesis;
src/delegating_prover: defines the trait that describing which aspects of a halo2 circuit need to be able to be printed into the desired format;
src/lean_delegating_prover: contains a specific implementation of a delegating prover that outputs a Lean model; and
src/scroll contains the code that we verified, taken from Scroll’s codebase, and with TermField substituted in.

‍

Verification of simple examples

‍

To start us off, we have verified a series of simple examples, as follows:

Fibonacci: a basic copy- and gate-based example computing the Fibonacci function;
RangeCheck: a gate-based range check;
Scroll/BatchedIsZero: a simple Scroll gadget that decides whether a set of values is all zeroes; and
Scroll/BinaryNumber: a simple Scroll gadget that decodes a binary number.

‍

Verification of the Scroll Keccak-256 circuit

‍

How does Keccak-256 work?

‍

We start with giving a high-level description of the Keccak-256 function. This function, like the entire Keccak family, is underpinned by the sponge construction, which comes with several beneficial properties, such as resistance to generic length-extension attacks.

‍

The architectural underpinning of Keccak-256, and the entire Keccak family, is the sponge construction. A sponge function, conceptually, operates by "absorbing" input data into an internal state and then "squeezing" output data from this same state.

‍

When it comes to the absorption phase, there exists a rate parameter, which denotes the number of bytes that can be absorbed at once. The input, which is of arbitrary length, is extended with a padding bitstring (which can be thought of as a null terminator) to round the length up to the nearest multiple of the rate. Interestingly, if the input is already a multiple of the rate or one bit short of a multiple, the padding is done to the next multiple of the rate, as the minimum Keccak-256 padding length is two bits.

‍

At the start, the sponge, which is 1600 bytes long, is set to all zeros. Then, we perform the absorption phase, in which, for each rate bytes in the padded input, we:

‍

XOR the bytes into the start of the sponge;
apply a permutation function (which, for Keccak, is the composition of five smaller permutation functions) to the entire sponge.

‍

Once this is done, the squeeze phase simply takes the first 256 bits of the sponge, which constitute the output of the Keccak-256 function. In general, squeeze phases of generic sponge constructions can and do iterate further to obtain outputs of arbitrary size.

‍

What does the Scroll Keccak-256 zk circuit look like?

‍

To show real-world applicability, we applied our verification infrastructure to the Scroll Keccak-256 zk circuit. This circuit is parametric on the length of the input string, in that the constraints themselves stay the same, but the length of the trace varies. In particular, the constraints deal with rate bytes at a time, then check whether there are more bytes left in the current bitstring or whether to move on to potentially computing another Keccak-256 of a different bitstring. For simplicity, we have verified a circuit that is long enough to handle a single instance of processing rate bytes.

‍

Some of the relevant columns of this circuit are as follows:

‍

one that is (meant to be) constrained to hold the padded input string;
one that calculates the RLC (Random Linear Combination) of the non-padded input string, which is probabilistically unique per input string, that is, collision-resistant;
one that marks whether a given repetition of the circuit is dealing with the last rate bytes of the input—we denote this column by is_final;
numerous columns that contain the sponge before and after each of the five permutation functions;
one which contains the final output string;
one which contains the RLC of the final output string; and
one which contains the input string length.

‍

What have we actually verified?

‍

The goal of our verification effort was to verify soundness of the Scroll Keccak-256 zk circuit, that is, that the circuit constraints imply the mathematical description of Keccak-256, which we have written ourselves in Lean and which is available here. In the end, we were able to prove the majority of the desired claims. The proof infrastructure can be found here, and is organised, at a high level, as follows:

‍

Gates: contains hand-written simplifications of each gate in the circuit;
Lookups: converts each set of lookups into a transformation function;
Spec: Program.lean contains the main Rust-to-Lean translation, which is facilitated by the rest of the files;
Soundness: contains the proofs working towards the full “constraints implies specification” proof—most statements prove that the constraints imply various parts of the Keccak-256 process, whereas the Sequencing folder brings the reasoning together; and
ProgramProofs: assuming the constraints, proves all of the assertions declared in Spec.

‍

In particular, the main claim about the correctness of the absorb and squeeze phases is squeeze_absorb_eval, located here. In summary, we verified that:

‍

✅ If the appropriate column contains a padded input string, then the output string column will contain the result of applying the absorb and squeeze phases as per the mathematical description.

‍

as well as that:

‍

✅ If the is_final column is set only at the appropriate point, and the length of the input is small enough to fit in a single repetition of the circuit, then the padding is applied correctly and the input RLC is correctly calculated from the unpadded input string.

‍

We were, however, unable to prove that the is_final column is indeed set only at the appropriate point, as the generated constraints appeared insufficient. Exploring further, we have understood that if is_final is set (in addition to at the end) on an arbitrary row that is a multiple of 8 bytes with respect to the input string, then:

‍

the length and RLC calculations must restart from that row, but will have been calculated correctly for the prefix as described so far;
the output and output RLC will still be calculated correctly for the full input string;
as the output string is much shorter than the input, the calculation of the RLC does not use the full column, leaving the output RLC cell on the intermediate row where is_final was set free to be set to any value we choose (in particular, the RLC of any output string).

‍

This effectively means that:

‍

⛔ We can claim that an arbitrary input string (the length of which is divisible by 8), when passed to Keccak-256, produces an arbitrary output of our choice.

‍

To convince ourselves further of this bug, we have produced a malicious witness generator that is able to create a trace that satisfies the constraints but does not describe the expected output, that is, that allows the SuperCircuit to claim the result of Keccak-256 to be arbitrary. This witness generator can be found here.

‍

How severe is this bug?

‍

This bug is, in principle, fairly severe. However, there exist a few mitigating factors. Firstly, this circuit is no longer used in production, and its next version does not suffer from the same bug. Secondly, as Scroll’s prover is centralised, it is sufficient to be convinced that this prover alone does not exploit this bug, and this was Scroll’s position when we contacted them about it. However, we are not aware that of any security audits or formal verification efforts on the Scroll prover that would have established that this is indeed the case. Moreover, we feel that is important to highlight that this approach to tolerance of potential bugs and exploits in the context of a centralised prover can be extended to removing the need for constraints entirely, which we would find to be controversial at best.