# Verifying Poseidon in Clean: Why the Last 'sorry' Is About Primality

- **Authors**: Martin Ochoa
- **Date**: May 04, 2026
- **Tags**: formal-verification, zk, poseidon, lean, clean

![poseidon-clean](https://blog.zksecurity.xyz/posts/poseidon-clean/cover.jpg)

Poseidon is one of the most popular hash functions of the zero-knowledge ecosystem. It was designed as an [arithmetization-friendly hash function for SNARKs, STARKs, and related proof systems](https://eprint.iacr.org/2019/458), and it appears in widely used libraries such as [circomlib's Poseidon circuit](https://github.com/iden3/circomlib/blob/master/circuits/poseidon.circom). It is used in Merkle trees, commitments, nullifiers, and many other places where a proof system needs a hash that is cheap to express as arithmetic constraints.

In this post, we walk through a Lean 4 proof of correctness for a [Clean](https://blog.zksecurity.xyz/posts/clean/) model of circomlib's optimized Poseidon circuit at arity 1. The theorem says that the modeled constraints are sound and complete with respect to the optimized Poseidon specification used by circomlib:

```lean
def circuit : FormalCircuit (F BN254_PRIME) field field where
  Assumptions _ := True
  Spec input output := output = poseidon1Opt input
  soundness := ...
  completeness := ...
```

In other words, if the constraints are satisfied, the output is the specified Poseidon hash. And for every input, the expected witness exists.

> [!INFO]
> The proof and specs live in the [`Clean` repository](https://github.com/Verified-zkEVM/clean):
>
> - Circuit and proofs: [`Clean/Circomlib/Poseidon.lean`](https://github.com/Verified-zkEVM/clean/blob/main/Clean/Circomlib/Poseidon.lean)
> - Textbook spec: [`Clean/Specs/Poseidon.lean`](https://github.com/Verified-zkEVM/clean/blob/main/Clean/Specs/Poseidon.lean)
> - Optimized spec: [`Clean/Specs/PoseidonOptimized.lean`](https://github.com/Verified-zkEVM/clean/blob/main/Clean/Specs/PoseidonOptimized.lean)
> - Companion walkthrough: [`Poseidon.md`](https://github.com/Verified-zkEVM/clean/blob/main/Clean/Circomlib/Poseidon.md) 

If you are familiar with Lean, you know that large proofs are often developed top-down. You state helper lemmas first, use them to shape the main theorem, and fill in their proofs later. Lean marks these unfinished proof obligations with the keyword `sorry`. After weeks of work on the Poseidon proof, there was one `sorry` left:

```lean
instance : Fact (Nat.Prime BN254_PRIME) := by sorry
```

Interestingly, this was not just a missing crypto argument. It was the primality proof for the BN254 scalar field modulus. This number is a standard parameter of the BN254 curve (also known as `alt_bn128`), used in [Ethereum's pairing precompiles](https://eips.ethereum.org/EIPS/eip-197) and many zkSNARK libraries. But Lean does not accept "standard parameter" as a proof. To complete the theorem, we still have to prove that this 254-bit integer is prime.

You have probably heard of probabilistic primality tests such as Miller-Rabin. Those are useful engineering tools: run enough rounds, and a false positive becomes astronomically unlikely. But high confidence would also not constitute a Lean proof. Here we need a proof object that the kernel can check.

Closing this `sorry` therefore requires a [*primality certificate*](https://en.wikipedia.org/wiki/Primality_certificate): a finite witness that reduces the claim *"`BN254_PRIME` is prime"* to computations and smaller primality claims. Surprisingly, as we will see in the following, the certificate we need is based on a theorem from 1876.

## Why Poseidon Is Worth Verifying

Most familiar hash functions, like SHA-256, are bit-oriented. They use rotations, XORs, bitwise Boolean operations, and additions modulo powers of two. These operations are natural on CPUs, but expensive inside arithmetic circuits.

Poseidon was designed for the opposite setting. Its core operations are field additions, field multiplications, S-boxes of the form $x \mapsto x^5$, and small matrix multiplications. These are the kinds of operations that R1CS, PLONKish arithmetizations, and many SNARK circuits handle efficiently.

Efficiency is a major reason behind why Poseidon appears so often in circom circuits. But it also means that a bug in the circuit is serious. If the circuit accepts a witness whose output is not the intended hash, then every protocol depending on that hash can inherit the bug. This is precisely the kind of mistake formal verification is designed to rule out. Instead of testing many inputs and hoping the wiring is right, we state mathematically what the circuit should compute and prove that every satisfying assignment computes it.

## Standard Poseidon vs. Optimized Poseidon

Poseidon is usually described as a sequence of full rounds and partial rounds using an MDS matrix. That is the clean mathematical description: apply S-boxes, add round constants, mix the state, and repeat. The circomlib circuit however, does not implement this description in the most literal way. It implements an optimized schedule that is algebraically intended to produce the same permutation, but with fewer constraints. The optimization uses a pre-sparse matrix before the partial rounds and sparse matrices during the partial-round phase. 

For verification, this distinction matters. If the circuit checks the optimized schedule, then the theorem should target the optimized schedule. Proving the circuit against a textbook-looking spec directly would force the proof to also absorb the optimization argument.

So the proof uses two Lean specs:

```text
poseidon1     -- simpler, non-optimized, full-MDS output spec
poseidon1Opt  -- optimized circomlib-style schedule
```

You can find these definitions in [`Specs/Poseidon.lean`](https://github.com/Verified-zkEVM/clean/blob/main/Clean/Specs/Poseidon.lean) and [`Specs/PoseidonOptimized.lean`](https://github.com/Verified-zkEVM/clean/blob/main/Clean/Specs/PoseidonOptimized.lean) respectively.

The final circuit theorem is stated against `poseidon1Opt`, because that is the structure the Clean circuit model actually follows. The simpler `poseidon1` spec is still useful: we test `poseidon1Opt` against it and against circomlibjs vectors as an output-level sanity check.

## What We Proved

Currently, the proof is about arity 1 Poseidon. In circomlib terminology, this means one input element, plus the initial zero element, so the internal state has width $t = 2$. The formally verified object is a Clean circuit model of the optimized circomlib schedule. The top-level theorem proves two properties:

**Soundness.** If an assignment satisfies all constraints of the Clean circuit, then the output is `poseidon1Opt input`.

**Completeness.** For every input, the honest computation produces a witness satisfying all constraints.

Together, these say that the Clean circuit computes exactly the optimized Poseidon function. No extra outputs are accepted, and no correct computation is rejected.

There are also two trust boundaries worth stating explicitly:

- First, the proof does not mechanically parse `poseidon.circom`. The Clean circuit is a direct hand model of the same optimized arity-1 structure. The correspondence is small and explicit, but it is not a verified compiler pipeline from Circom source to R1CS.
- Second, the optimized Poseidon spec is tested against the simpler non-optimized spec and against circomlibjs vectors, but the equality `poseidon1Opt = poseidon1` is not yet proved for all field elements. This is a useful validation layer, not a formal equivalence proof.

With those boundaries made explicit, we gain strong confidence in the soundness and completeness of the circomlib Poseidon circuit: the Lean proof gives a machine-checked guarantee for the Clean circuit model, while the remaining assumptions are small, inspectable modeling links rather than hundreds of algebraic constraints checked by hand.

## The Circuit Shape

The optimized arity-1 circuit follows this pipeline:

```text
[0, input]
    |
ARK C[0..1]
    |
3 full rounds
    |
transition round with P
    |
56 sparse partial rounds
    |
3 full rounds
    |
final SBOX + MDS
    |
state[0]
```

This is the optimized schedule, so the Lean specification used by the final theorem is:

```lean
Specs.PoseidonOptimized.poseidon1Opt
```

## Proving Small Pieces First

Trying to prove the whole circuit at once would be difficult: arity-1 Poseidon already has more than 400 intermediate witness variables, and the 56 partial rounds are enough to make naive unfolding painful for Lean. Instead, the proof follows a natural circuit decomposition strategy.

The proof has three layers of `FormalCircuit`s:

```text
Sigma                                       -- x ↦ x^5
FullRoundOpt_t2 / PartialRoundOptStep_t2    -- one round
ApplyFullRounds / ApplyPartialRoundsOpt     -- a phase of N rounds
```

Only `Sigma` is its own bottom-level circuit: it introduces witnesses for $x^2$, $x^4$, and $x^5$, and the soundness proof just rewrites the constraints and closes with ring arithmetic. The ARK and matrix arithmetic is inlined directly into the round-level circuits, since it's cheap enough that factoring it out would obscure the proof more than it would shorten it.

The round-level circuits are then built around `Sigma`:

```text
FullRoundOpt_t2:        Sigma on both state elements, then ARK, then Mix
PartialRoundOptStep_t2: Sigma on the first element, ARK on the first element, then MixS
```

This gives the proof a modular shape. If a component is wrong, the local theorem for that component fails. If the component is correct, the next layer can use its specification instead of redoing the algebra every time.

## The Hard Part: Composition

Once the round-level components are sound and complete, the challenge is to glue 56 partial rounds (and two full-round phases of 3 each) into one end-to-end proof. The naive approach is to write the whole pipeline as one expression and let Lean's simplifier unfold every fold. That does not scale: the resulting term is enormous, elaboration becomes slow and brittle, and the proof script becomes unreadable.

The technique we use is to **wrap each loop in its own `FormalCircuit`**. `ApplyPartialRoundsOpt` is a `FormalCircuit` whose `main` is a `Circuit.foldl` over the 56 per-round circuits. Its `Spec` says simply: "the output equals `partialRoundsOpt_t2 C_t2 S_t2 56 10 0 input`". `ApplyFullRounds` does the same for the 3-round full-round phases. Each loop becomes a self-contained gadget with its own soundness proof.

The benefit of this encapsulation is that the loop is hidden behind its `Spec`. The top-level Poseidon circuit composes a handful of these gadgets sequentially; it never has to see the underlying fold, and the composition between phases reduces to chaining equalities between their specs.

Inside `ApplyPartialRoundsOpt.soundness`, we still need an induction to go from "each partial round produces the spec's next state" to "after 56 rounds, the state matches `partialRoundsOpt_t2`". This is handled by a private induction lemma, together with an `envState` helper that recovers the state at each iteration from the witness environment. The induction is local to the loop circuit and is never re-run at the top level.

With the loops encapsulated this way, the top-level `Poseidon1` proof is relatively short. After `circuit_proof_start` unpacks the constraint hypotheses, a focused `simp` unfolds each phase's `Spec`, and a final projection extracts state element 0. The linear composition of subcircuit specs *is* the proof.

## What the Final Theorem Says

The final circuit is:

```lean
def circuit : FormalCircuit (F BN254_PRIME) field field where
  main := main
  localLength _ := 402
  Assumptions _ := True
  Spec input output :=
    output = Specs.PoseidonOptimized.poseidon1Opt input
  soundness := ...
  completeness := ...
```

The `localLength` value accounts for every witness variable:

```text
initial ARK:        2   (InitialArk)
first full rounds: 24   (3 × 8 in ApplyFullRounds)
transition:         8   (FullRoundOpt_t2 with P)
partial rounds:   336   (56 × 6 in ApplyPartialRoundsOpt)
last full rounds:  24   (3 × 8 in ApplyFullRounds)
final round:        8   (FullRoundOpt_t2)
----------------------
total:            402
```

The soundness theorem starts from an arbitrary environment satisfying the constraints and ends with:

```lean
Expression.eval env ((main input_var).output offset)
  = Specs.PoseidonOptimized.poseidon1Opt input
```

The completeness theorem goes in the other direction: for any input, the circuit's honest witness generation satisfies the constraints. This is the core result. The Clean model of optimized arity-1 Poseidon computes exactly the optimized Poseidon specification.

## The `sorry` in the Room

All of this takes place over the BN254 scalar field:

$$
p =
21888242871839275222246405745257275088548364400416034343698204186575808495617.
$$

Lean needs to know that this number is prime. As discussed before, it is not enough to show it is a probable prime, such as in "Miller-Rabin passed on many bases", but prime as a theorem:

```lean
Fact (Nat.Prime BN254_PRIME)
```

This matters because the proof uses field reasoning. If the modulus were composite, `ZMod p` would not be a field. Algebraic facts that are valid in a field would no longer be available.

So the last `sorry` is crucial, but it is also independent of Poseidon. It is a number-theoretic certificate for a fixed 254-bit integer.

## Lucas's Test

The classical Fermat test says that if $p$ is prime, then:

$$
a^{p-1} \equiv 1 \pmod p
$$

for every $a$ not divisible by $p$. The converse is notably false. In fact, a whole (infinite) family of numbers, the so-called [Carmichael numbers](https://en.wikipedia.org/wiki/Carmichael_number), can pass Fermat tests while being composite. 

Lucas's test adds one crucial condition. If there exists an $a$ such that:

1. $a^{n-1} \equiv 1 \pmod n$,
2. for every prime factor $q$ of $n-1$, $a^{(n-1)/q} \not\equiv 1 \pmod n$,

then $n$ is prime.

The intuition relates to the order of elements in a group. The first condition says the order of $a$ divides $n-1$. The second says it does not divide any of the smaller candidates obtained by removing a prime factor. Together, they force the order of $a$ to be exactly $n-1$, and that can only happen when $n$ is prime. Mathlib contains this theorem as `lucas_primality`.

## Pratt Certificates

Lucas's test seems to require knowing the prime factorization of $n-1$, including proofs that those factors are prime. Pratt's observation was one can use a recursive strategy in order to use Luca's test to proof primality. 

To prove that $n$ is prime, provide:

1. a Lucas witness $a$,
2. the factorization of $n-1$,
3. Pratt certificates for each prime factor.

The recursion stops at the number 2, as in the following example.

![Pratt certificate tree for 2063](https://blog.zksecurity.xyz/posts/poseidon-clean/pratt-2063.svg)

This gives a compact proof object that a proof assistant can check. The relatively expensive work of finding the factorization and witnesses can happen offline. Lean only has to verify the certificate.

For BN254, the top-level factorization of $p-1$ is:

$$
p - 1 =
2^{28}
\times 3
\times 13
\times 29
\times 983
\times 11003
\times 237073
\times 405928799
\times 1670836401704629
\times
13818066809895115352007386748515426880336692474882178609894547503.
$$

The largest factor is itself large, so it needs its own certificate tree. But the structure is the same all the way down.

The [Verified-zkEVM project](https://verified-zkevm.org/) maintains [CompPoly](https://github.com/Verified-zkEVM/CompPoly), a Lean library for computable polynomials, which already contains a machine-checked Pratt certificate for the BN254 scalar field prime in its [BN254 field formalization](https://github.com/Verified-zkEVM/CompPoly/blob/master/CompPoly/Fields/BN254.lean). That certificate can close the remaining `sorry` by providing the missing `Fact (Nat.Prime BN254_PRIME)` instance.

![Pratt certificate sketch for BN254](https://blog.zksecurity.xyz/posts/poseidon-clean/pratt-bn254.svg)

At the time of writing, this last `sorry` is still present in our Clean development. The missing ingredient is not a new number-theoretic argument, since the certificate already exists. The remaining work is engineering work: bumping Clean's Lean toolchain to the 4.29 release that CompPoly targets, so the certificate can be imported and the `sorry` discharged. This work is already in progress.

## What This Gives Us

After closing the primality certificate, the Lean proof establishes that the Clean arity-1 model is sound and complete for `poseidon1Opt`. The strongest remaining uncertainty is no longer inside the Lean proof. It is at the modeling boundary:

```text
poseidon.circom source
        |
   hand-modeled in Clean
        |
Lean proof of soundness/completeness
        |
optimized Poseidon spec
        |
test-vector agreement with non-optimized spec and circomlibjs
```

The result is useful precisely because of this shift. The question is no longer *"did we inspect 402 witness variables and dozens of round constants correctly?"* (something which would be very hard for a human reader to confidently answer) but *"does this Clean model match the Circom circuit it claims to model?"*, which an auditor can verify by reading both side by side.

## What's Next

The current proof covers arity 1. Circomlib Poseidon is also commonly used at arities 2, 3, and 4, for example in binary Merkle trees and wider hash inputs.

The specs and test vectors for those arities already exist. The next step is to extend the circuit proofs. The same pattern should carry over: a small S-box circuit, round-level gadgets with inlined arithmetic, and phase-level loop circuits. The harder part is again the composition problem at larger state widths.

There are also two natural ways to strengthen the statement:

1. prove the optimized and non-optimized Poseidon specs equivalent for all inputs,
2. mechanically connect the Clean model to the original Circom source or compiler output.

The current result is a useful milestone: a machine-checked proof that a realistic optimized ZK hash circuit model computes exactly the function it claims to compute, with the last missing assumption reduced to a classical primality certificate.

**Acknowledgements** We thank Gregor Mitscha-Baude, Yoichi Hirai, and David Tedgui for their help with the Clean proof and for valuable comments on this blog post.

---

This article was published on the [ZK/SEC Quarterly](https://blog.zksecurity.xyz) blog by [ZK Security](https://www.zksecurity.xyz), a leading security firm specialized in zero-knowledge proofs, MPC, FHE, and advanced cryptography. ZK Security has audited some of the most critical ZK systems in production, discovered vulnerabilities in major protocols including Aleo, Solana, and Halo2, and built open-source tools like [Clean](https://github.com/Verified-zkEVM/clean) for formally verified ZK circuits. For more articles, see the [full list of posts](https://blog.zksecurity.xyz/llms.txt).
