Zero-knowledge Virtual Machines (zkVMs) leverage zero-knowledge proofs (ZKPs) to verify the correctness of computations executed on a specific instruction set architecture. In practical terms, zkVMs allow you to write programs in familiar high-level languages — such as Rust or C — without having to deal with the details of ZKPs. By abstracting these complexities, a secure zkVM can generate and verify proofs for any application.
In this post, we give a brief refresher on zkVMs and discuss several projects that have significantly influenced their design and evolution. We want to emphasize that the list of projects discussed here is neither exhaustive, nor do we go into too much technical detail just yet. Instead, this post marks Part 1 of an entire series on the landscape of zkVMs and their technical particularities. For an introduction centered around security, we highly recommend you also take a look at our blog post on zkVM Security.
Now, without further ado, let’s dive right in, shall we?
A Brief Refresher on zkVMs
Blockchain protocols such as Ethereum are often described as an “internet computer” — a computer everyone can use by installing their own applications and using each other’s applications. Today’s internet computers have a problem, though: to ensure the computer hasn’t made a mistake, every user is forced to reexecute everything. Sounds like a pretty clunky piece of tech, doesn’t it? But fear not. This is about to change!
The recent progress around ZKPs allows us to enhance these internet computers with computational integrity. In other words, only the computer has to compute. You and I can lean back and trust the result, thanks to cryptography!
This is where zkVMs come into play. But before we get ahead of ourselves, let’s briefly recall the lowest level of development for general-purpose ZKP systems: arithmetic circuits.
Arithmetic circuits are an intermediate representation that represent an actual circuit, but using math, so that we can prove it using a proof system. In general, you can follow these steps to make use of a general-purpose ZKP system:
- take a program you like
- compile it into an (arithmetic) circuit
- execute your circuit in the same way you’d execute your program (while recording the state of the memory at each step)
- use your proof system to prove that execution
What does an arithmetic circuit really look like? Well, it looks like a circuit! It has gates and wires, although its gates are not the typical circuit ones like AND, OR, NAND, XOR, etc. Instead, it has arithmetic gates: a gate to multiply two inputs, and a gate to add two inputs.
Source: Real-World Cryptography
Okay, with that out of the way, let’s talk about virtual machines (VMs). A VM can usually be broken down into three components:
- Some memory that can be used to store and read values. (E.g., if you add two values together, where do you read the two values from? And where do you store the result?)
- A set of instructions that people can use to form programs.
- Some logic that can interpret these instructions.
In other words, a VM looks very much like this:
for instruction in program:
parse_instruction_and_do_something(instruction)
This loop is typically referred to as the fetch-decode-execute cycle because each instruction is fetched from memory, decoded, and then executed. For example, using the instructions supported by the Ethereum VM (EVM), you can write the following program that makes use of a stack to add two numbers:
PUSH1 5 // will push 5 on the stack
PUSH1 1 // will push 1 on the stack
ADD // will remove the two values from the stack and push 6
POP // will remove 6 from the stack
Alright, so what about zkVMs, then?
From the outside, a zkVM is very similar to a VM: it executes programs and returns their outputs. Developers still have access to the same set of instructions, which, like most VMs, is usually just the base for a nicer higher-level language (which can compile down to those instructions).
BUT: a zkVM also returns a cryptographic proof of the program execution’s correctness that users can verify!
Looking inside a zkVM reveals some arithmetic circuits — the ones we’ve talked about above, remember? And these arithmetic circuits “simply” implement our VM loop:
for instruction in program:
parse_instruction_and_do_something(instruction)
The reason we put “simply” in double quotes here is because it’s far from simple to implement this loop in practice. But that’s the basic idea behind zkVMs.
Lastly, it’s worth noting that there exist other approaches to computational integrity that are closely related but are not to be confused with zkVMs. For example:
- Aleo VM — an intermediary representation where a compiler later translates the VM instructions into an actual circuit.
- zkLLVM — targets high-level programming languages by compiling LLVM’s intermediate representation directly into a circuit.
- Lurk — proves an abstract state machine that evaluates a program directly, by successively attempting to reduce it following a simple set of rules, bypassing any compilation phase.
But more on these in a future post of this series… (⌐■_■)
Okay, now that we have a good grasp of what zkVMs are, let’s take a closer look at some of the projects that got us where we are today!
The OG zkVM
First published in 2013, Succinct Non-Interactive Zero Knowledge for a von Neumann Architecture by Ben-Sasson, Chiesa, Tromer, and Virza was the first paper to deliver a practical construction of a zkVM. As the title suggests, the machine described in the paper follows a von Neumann architecture, i.e., both the program and the data are stored in the same memory. Since most modern CPUs are based on this paradigm, any program that can run on a classical computer can, in theory, also run on this architecture.
The paper introduced its own RISC architecture called vnTinyRAM and presents a C compiler ported to target this instruction set. The proof system is designed to verify the correctness of program execution up to a fixed number of steps, with the underlying idea being that the circuit is constructed as a repeated state transition function, unrolled until the instruction count limit is reached. Notably, this results in a bounded zkVM, where the proof only covers a predetermined number of cycles. Many people assume that zkVMs are inherently unlimited, but vnTinyRAM’s design imposes a fixed cycle limit, contrasting with more modern designs like Cairo, which we’ll discuss in greater depth in the next section. Like many other modern zkVMs, Cairo leverages STARKs, which can dynamically scale with the number of cycles required, effectively enabling an unbounded zkVM. But we’re getting ahead of ourselves…
The creation of vnTinyRAM set a strong precedent in the field, showing that it’s possible to securely and efficiently verify real-world computations using ZKPs. For the sake of completeness (pun intended), we should also mention that later analysis uncovered some oversights in vnTinyRAM’s SNARK. While those small mistakes actually had severe security implications, it was possible to quickly fix them in subsequent revisions. To read up on these issues and their resolution, we encourage you to check out the following two reports:
- A Note on the Unsoundness of vnTinyRAM’s SNARK
- On the security of the BCTV Pinocchio zk-SNARK variant
Von Neumann for STARKs
Eight years after the introduction of vnTinyRAM, Cairo marked the next significant milestone in modern zkVM development by introducing the first efficient and practical von Neumann architecture that can be used with the STARK proof system to generate proofs of computational integrity. The name Cairo comes from “CPU AIR,” where AIR (Arithmetic Intermediate Representation) refers to the arithmetization used for STARKs. In other words, Cairo is an AIR that is implementing the concept of a CPU.
Some notes:
- If you’re new to the concept of arithmetization, our 1-hour arithmetization course will get you up to speed.
- In case you’re interested in formal methods, you might also want to check out StarkWare’s formal analysis of their fetch-decode-execute logic using Lean 3.
- Bobbin Threadbare’s Distaff is a STARK-based VM that was published even earlier than Cairo. However, it never made it to production-ready status. Still, it is an interesting piece of zkVM history. For more details, head over to the original post on EthResearch and the Distaff repo.
Cairo features a unique memory model that leverages permutations and lookup techniques to enforce custom constraints similar to approaches seen in lookup protocols like plookup.
Source: Cairo whitepaper
By using so-called builtins (predefined operations implemented directly as sets of equations), Cairo was the first project to introduce the idea of precompiles in zkVMs. These builtins allow Cairo to further minimize the overhead typically involved in translating common operations, such as hash functions, into polynomial constraints. Moreover, it should be noted that Cairo uses a custom Instruction Set Architecture (ISA) that is specifically engineered for the zk-proving context, setting it apart from other approaches like RISC-V-based systems and zkEVMs. (More on that in a second.)
Lastly, recall that Cairo’s design fully leverages the flexibility of STARKs to support an unbounded zkVM, where the circuit can dynamically scale with the number of cycles required for a program to terminate. This stands in contrast to bounded systems like the aforementioned vnTinyRAM, which unroll their execution trace only up to a fixed instruction count.
If you’re looking to learn more about Cairo, we highly recommend you take a look at the relevant chapters of our STARK book.
Integrating RISC-V
The RISC Zero zkVM marked another important milestone in the evolution of zkVMs as it was the first ever to integrate the RISC-V architecture. Its design builds on a three-circuit architecture: two STARK-based circuits and one SNARK circuit. More specifically, a given program is first executed, generating a number of “segments,” where each segment is then proven using a STARK. Subsequently, all segment proofs are aggregated into a single proof, also using a STARK. Lastly, the resulting proof is shrunken using a SNARK, making verification more efficient.
This approach, often referred to as continuations, offers several benefits: it allows for parallelizing the proving process by distributing segments across multiple machines, enables pausing and resuming of proof generation, and keeps memory requirements fixed regardless of the overall program length. With continuations, programs can run for as long as needed while maintaining efficient proof generation.
Source: RISC Zero Study Club’s presentation on continuations
Moreover, RISC Zero has optimized its design for GPU proving and integrated precompiles for common operations like SHA-256 and 256-bit modular multiplication, improving performance without altering the underlying proof system. Lastly, the RISC Zero zkVM has broad compatibility in the Rust ecosystem, ensuring that many existing Rust crates work with the system without additional configuration.
Optimizing zkVMs with Lookup Arguments
Jolt (Just One Lookup Table) and its companion paper Lasso, which were published together in 2023, introduced a new front-end technique for zkVMs that leverages lookup arguments to streamline the representation of program execution. While Lasso focuses on introducing a new family of lookup arguments, Jolt details how Lasso can be used to build a RISC-V zkVM. Rather than generating a circuit that both selects and executes each instruction step-by-step, Jolt replaces the execution step with a single lookup into a massive, pre-computed table. For example, for a RISC-V instruction that operates on two 64-bit inputs, one could imagine a table with $2^{128}$ entries. However, thanks to its structured design and decomposability, this table is never fully materialized, and its properties allow for efficient handling.
What makes Jolt/Lasso’s result particularly impressive is that it achieves Barry Whitehat’s vision of the lookup singularity that he published towards the end of 2022. The goal of this vision was to transform arbitrary computer programs into circuits that only perform lookups in order to improve performance, auditability, and formal verification of the correctness of the front-end.
Speaking of auditability: One of Jolt’s main selling points is the simplicity of the codebase. While other zkVMs typically require multiple tens of thousands of LOC, Jolt is implemented in less than 25,000 lines of Rust.
For a deeper dive into the Jolt zkVM, be sure to check out our other blog posts on the topic:
- A Challenge on the Jolt zkVM
- Improving the Security of the Jolt zkVM
- A Technical Dive into Jolt: The RISC-V zkVM
Figure 2 of the Jolt paper: Proving the correctness of CPU execution using offline memory checking and lookups
Other Influential zkVM Projects
The zkVM landscape is rapidly evolving, with many noteworthy projects worth exploring. While we’ll cover them in future posts, here’s a non-exhaustive list in no particular order for you to check them out in the meantime.
- Valida - Based on Plonky3. Implements a custom ISA and integrates with the LLVM toolchain.
- Scroll - zkEVM powering the Scroll zk-Rollup.
- SP1 - Based on Plonky3. Implements a RISC-V-like ISA.
- Nexus - Strong focus on parallelization. Implements a RISC-V-like ISA.
- zkMIPS - Based on Plonky2 and the MIPS microarchitecture.
- zkWASM - zkVM that supports Web Assembly.
- EraVM - zkEVM powering the ZKsync protocol.
- Miden - zkVM based on STARK proofs.
- Ceno - Strong focus on segmentation and parallelization.
- zkMove - Computational integrity for Move smart contracts.
- o1VM - MIPS zkVM grafted onto the Kimchi proof system.
- Linea - zkEVM powering the Linea layer 2 network.
Ongoing Work at zkSecurity
At zkSecurity, we’re currently developing a formal verification framework for zkVMs using Lean 4 as part of Ethereum’s zk(E)VM Formal Verification Project. Our framework, clean, is an embedded Lean DSL for writing zk circuits targeting AIR arithmetization. We intend to build out clean into a universal zk framework that can target all arithmetizations and produce formally verified, bug-free circuits for the entire ecosystem. For more details, be sure to check out our clean GitHub repo.
Special thanks to François Garillot, Ventali Tan, and Jerry Kwok for feedback and review.