An Opinionated Overview of ZK Tooling and Proof Systems Right Now

Sep 1, 2023 crypto zk

When entering the ZK space, it’s easy to be overwhelmed. ZK provides succinctness, verifiability, and privacy, but it’s unclear where different concepts lie on these three axes. Everyone is shilling their own protocol, and there are a ton of different proving standards and papers coming out every day. Folks often have no grounding to even start thinking about the security, efficiency, and tradeoffs of different ideas and protocols. Unfortunately, it’s very hard to quickly distinguish what is worth investing in when precise security guarantees or undisclosed “gotchas” are unclear. I will summarize how I am personally currently thinking about the space of ZK tech, especially as we make decisions for what to prioritize for our own code and protocols. I am not perfectly versed in all of the tradeoffs of all of the recent ideas, but this will be a live doc updated as I read and learn more, and as folks comment corrections.

This is NOT an indictment of the ideas/protocols I don’t highlight or cover favorably, nor does this represent the opinions of anyone I cite or credit (they are my interpretations only). I am aiming to make an intellectually honest survey, and so if I misunderstand something, please tell me (telegram, twitter) – I am very open to continual changes and improvements, especially as the space and this tech rapidly evolves. You can leave comments on this hackmd. Thanks to Nalin, John, Yi, Richard, Sora, Ratan, ShuklaAyush, and Vivek for thoughts on this post, in addition to the countless folks behind these protocols themselves, and folks who I’ve had conversations with regarding zk over the last 2 years! Thanks to Richard and Sachin for touching on many of these points in their ZK Summit London talk as well.

Last updated Jun 20, 2024.

ZK Proving Languages and Stacks

Two benchmarks for server-based ZK proof stacks are Celer’s benchmarking and Modulus Labs’ graphs. There is extremely high variance in how much they were able to optimize each circuit to the quirks of the individual proving systems, but they give a decent grounding to start looking at tradeoffs. I’ve summarized some here as well, and provide a conclusion at the end.

Circom: This is the language that has historically been used by the main zk apps in production, including Dark Forest and Tornado Cash. It is commonly used because it has the fastest browser proving time due to optimized WASM proofs in the browser, super-fast server side proving via rapidsnark which is only about 10% slower than gnark, and extremely small and fast on-chain verification (8 uint256s verified in ~300K gas). The language is relatively easy to pick up and experiment with (i.e. at zkrepl.dev) and has a good developer community that has contributed a significant number of circuits. There are a number of unofficial, unaudited backends including Nova, a very slow PLONK, FFLONK which is like PLONK but with a 10x slower prover and a 33x cheaper verifier, and even [unaudited, untested, unmaintained] STARKs.
Halo2: This is the library originally developed by zcash for generating PLONKish ZK proofs. Unfortunately, though Axiom, EZKL, and PSE used to maintain it, they now all have fairly divergent forks with less maintenance. For us, much of the speedup is due to PLONK supporting lookups. It’s built to support multiple backends, KZG (which is better for on-chain verification) and IPA – this makes the circuits future-proof in case the latest and greatest proving system changes, or lookup arguments get faster. It’s approachable to learn via Axiom’s crash course and 0xPARC’s class. When prover optimized (i.e. k under 13), our circuits have sub minute proofs in browser (about 10x faster than circom) with downloads < 500MB (5x smaller than circom), primarily driven by lookups saving constraints. To prove on chain, there are direct solidity verifiers that cost millions of gas on an L2, or GPU-accelerated (repo) server-side recursive provers with recursive verifiers that make arbitrary proofs only ~450K gas onchain. Even with recursion, it should still be faster than circom overall. There are various circuit subparts that make new circuits easy to build, at Axiom’s halo2-lib and zk-email’s circuits. halo2repl.dev makes development more accessible, but doesn’t support external imports yet.
PIL: Jordi’s second project after Circom was to build a STARK proving language optimized around VM computations. Due to the way they define certain operations and primitives, it’s only really suitable for circuits that are VMs right now. They are verified on chain via recursive PIL STARKs, and PIL2 STARKs in circom.
plonky2: Plonky2 is quite fast on bare metal. Unfortunately, you do take a risk since the security of Deep-FRI hasn’t yet been formally proven in an academic paper – Fiat-Shamir was just proved, but novel ideas like grinding have no academic proof yet. We should expect browser-based implementations to have a 4-8x slowdown because WASM does not get the main vectorized speed gains (which mostly come from the Goldilocks field size being small enough to fit elements in single vectorized registers). Unfortunately, vectorization for in-browser WASM is still pending standardization. Rumor is that plonky2 potentially has some fast closed-source browser implementations, but this cannot be verified or benchmarked yet. However, on a bare metal macbook it completes pretty large proofs (1000 poseidon hashes so ~300K constraints) in less than a second, so perhaps this slowdown isn’t that bad. Proofs are closer to 250kB, which is a no-go for on-chain proofs, though Succinct is working on a promising WIP recursive verifier in circom groth16 (40M constraints, so < 1 minute in cloud) for plonky2 without lookups. Plonky2 has a lot of optimizations, for instance skipping the top hashes in the Merkle tree, in exchange for longer commitments and proofs – they also recently added LogUp lookups. plonky2 is going to be replaced by plonky3 which isn’t ready yet, so it’s a bit unclear what to implement in if you were to start now. We are optimistic for a recursive FRI verifier in PLONK – this one is right-field specific, so doesn’t work for Goldilocks, but this WIP one should be flexible to plonky2’s native Goldilocks.
plonky3 + SP1/Lita: Unlike what you might expect, there is no PLONK in plonky3. It is a STARK implementation by the same core team (Daniel Lubarov etc). Early benchmarks show it is likely the fastest STARK prover out there, for good reason. It has several insane improvements. For instance, they inherited Merkle caps from plonky2, which skip the top few layers of the Merkle tree verification and replace them with a set check, and a form of parallelized AIR matrices that let you group high degree AIR tables together so that you minimally waste hashes (MMCP?). SP1/Lita offer Rust interfaces through which people can very easily create plonky3 proofs, and SP1 includes chain-friendly recursive proofs.
Starky: This is a STARK proving language built in the same monorepo as plonky2 (they both have a FRI backend) – due to only doing hashing instead of elliptic curve ops, it’s still faster proving than PLONK. It is super early; i.e. plonky2/starky keccak is still underconstrained.
Winterfell: Early STARK implementation that is very well written with good documentation and is easier to use and learn, but due to its earliness we don’t expect it to be as fast. Maybe Polygon Miden has some good closed source implementations?
GKR/Sumcheck: This seems to perform very well in Modulus Labs’ tests. It was a SNARK precursor from 2008 that has recently been combined with SNARK work to lead to some big speedups. Thaler has a good breakdown – GKR and sumcheck are still very relevant, and in fact are among the main advances that Lasso uses to beat the usual grand product argument. We hear that one of the big plans for proving system improvements is migrating the default lookups (usually grand product style via the PLONK paper) to a LogUp-based scheme based on GKR. Lasso/Jolt is also based on GKR, and they point out accurately that small scalar MSMs do in fact lead to decent (but not the claimed 10-40x) efficiency gains. Lasso has been reasonably criticized since it’s not as novel as they imply, mostly a synthesis of existing techniques, and their open source code is incomplete. Note that Jolt has no ZK and will not till around October 2024, so we don’t recommend it for consumer apps. I can see this line of work continuing to evolve in the future, and we expect the state of the art to be added to halo2 when ready. Custom lookups along this line of work (like cq for instance) may be fruitful if you need to squeeze efficiency out of your prover for massive proofs, like ML proofs.
Noir: This is an experimental new compiler by Aztec that compiles a higher level language into any number of proving backends. It’s one of the easier languages to get started in – a lot of docs, optimized dev experience, and more intuitive mental models (i.e. if statements are supported). Unfortunately, having multiple backends isn’t actually a big gain over circom right now, as circom in practice has more: Nova, a very slow PLONK, and STARKs as well, even though it wasn’t built for them. While their new backend of Honk + Goblin Plonk + Protogalaxy is faster in theory with only 16kB proofs (but won’t be audited till 2025), the poor precompiles and unconstrained witness generation speed make it quite slow to use in practice. Goblin Plonk specifically is powerful to speed up PLONKish type systems for recursive PLONK proofs with repeated structure (like hashes or regex). The Noir RSA circuit in their benchmarks as of June 2024 was insecure and underconstrained – the older, full version that Richard ran an RSA benchmark on had Noir circuits taking ~10x longer to prove (i.e. what took ~30 sec to prove browser-side in Circom took ~6 minutes in Noir). As of July 2024, Zac has a faster lookup-based version + new bigint library that seems promising. Their browser-based code has not been optimized at all yet, so even though constraint count is lower, browser benchmarks I’ve seen have been worse. I can see a hybrid system with the Noir-generated R1CS and circom’s optimized provers being a great combination, but I haven’t seen it tried yet. I’m curious to see benchmarks of circom’s r1cs constraints compiled in Noir. I think they should focus on webGPU, optimizing memory query reordering, and general WASM optimizations – if they do this, I expect Noir to be much better than circom in 2025.
Folding Schemes: Nova-based schemes, explained excellently by Taiko, are expected to make competitive client-side proofs due to parallelizable recursion in schemes like ParaNova. You can also prove in any curve with a cycle (including secp256k1, Ethereum’s curve) directly, making most computation faster by avoiding wrong-field arithmetic. Projects including Nova Scotia have made making circuits with repeated structure (like hashing or regex matching) accessible to developers via circom, and Sonobe makes it easy to write Nova in Rust with a ~25 million constraint recursive groth16 proof automatically. The main problem is that the Nova schemes are not actually zero knowledge right now, except the next version of HyperNova – Hypernova is more efficient via combining Nova with sumcheck, CCS, and maybe even zero knowledge. Nova even has lookups. Protostar generalizes Nova from R1CS to PLONK style systems, and Lev is working on a WIP implementation. Most folding folds two instances into one; however, ProtoGalaxy folds multiple at once into one more efficiently – the paper for ProtoGalaxy is reportedly the most accessible folding paper, so if you’re interested in the academic details we recommend starting there, and there are Rust and C++ implementations of it (not zero knowledge though, and recursive proofs are 32M+ constraints). The sensationalized bug in the Nova implementation has been fixed – contrary to Twitter comments, this is NOT an existential bug and can be easily remedied. Looking forwards to seeing more accelerated Nova benchmarks and generalized compilers to convert arbitrary existing ZK code into HyperNova etc as well, which will probably help with large circuits like zkevms in Nova, which we have yet to see comparative benchmarks for.
Binius: This is the client-side proof endgame in my opinion. In my initial WASM benchmarks, unparallelized Binius with an optimized Keccak in the browser is around 8x faster than parallelized circom for Keccak in the browser. With parallelization, this could get as high as 10-40x faster. The speed admittedly primarily comes from the commitment scheme being prover-optimized and so having massive proofs, but I expect it to still be competitive with more succinct commitment schemes. Due to using binary fields aka bits (like the latest MPC protocols), it’s super fast because it can avoid EC operations and FFTs. Note that it will open to developers in July 2024, and will likely only be audited by 2025.
Spartan: Lets you do right-field arithmetic on ECDSA because it supports more elliptic curves, so spartan-ecdsa is both fast and performant. Circuits are harder to write and there isn’t much tooling, but for just proof of Ethereum signatures, this is pretty good.
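
To make the Merkle cap idea from the plonky2/plonky3 entries concrete, here is a minimal Python sketch. It is not plonky2’s or plonky3’s actual code: SHA-256 stands in for their hash functions, the names build_layers, commit_with_cap, open_leaf, and verify are hypothetical, and it assumes the number of leaves is a power of two. The point is only that the commitment becomes a whole layer of nodes instead of one root, so each opening path stops early and ends with a lookup into that layer.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 is an illustrative stand-in for the prover's real hash function.
    return hashlib.sha256(data).digest()

def build_layers(leaves):
    """Return all tree layers, from hashed leaves (index 0) up to the root."""
    layer = [h(leaf) for leaf in leaves]
    layers = [layer]
    while len(layer) > 1:
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        layers.append(layer)
    return layers

def commit_with_cap(leaves, cap_height):
    """Commit to the layer holding 2**cap_height nodes instead of a single root."""
    layers = build_layers(leaves)
    depth = len(layers) - 1
    return layers, layers[depth - cap_height]

def open_leaf(layers, index, cap_height):
    """Sibling path for one leaf, stopping cap_height layers below the root."""
    depth = len(layers) - 1
    path = []
    for level in range(depth - cap_height):
        path.append(layers[level][index ^ 1])  # sibling at this level
        index //= 2
    return path

def verify(leaf, index, path, cap):
    """Hash up the shortened path, then check the result against the cap."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    # The top layers are never hashed: the last step is a cap membership check.
    return cap[index] == node

leaves = [bytes([i]) for i in range(8)]
layers, cap = commit_with_cap(leaves, cap_height=2)
path = open_leaf(layers, 5, cap_height=2)
print(len(cap), len(path), verify(leaves[5], 5, path, cap))
```

With 8 leaves and cap_height = 2, the commitment grows from 1 hash to 4 while each opening path shrinks from 3 siblings to 1. Setting cap_height = 0 recovers an ordinary Merkle root.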

In conclusion, for writing a new circuit of your own we recommend:

circom for client side proofs – you can use snarkjs directly in browser if the proof is small, or compile to hypernova/supernova via Arecibo or Sonobe if you’d like for 5-10x faster client side proofs. Also good for server-side proofs you want to prove on-chain, where privacy is less critical. Best tooling, lectures to learn (more general, circom-specific), ease of switching to Nova/STARKs, simple repl to get started, and library of existing circuits.
SP1 for server side proofs (where client-side privacy is less critical). Some theoretical risks of grinding not having an academic proof. Best for succinctness proofs like ZK EVMs, or things that are OK requiring specialized hardware for speed.
noir for some client-side proofs if you’re ok writing most cryptographic libraries yourself (so proofs are likely primarily used for privacy, not succinctness).
supernova/hypernova via circom for ultra-fast client side proofs. On-chain Nova verification will require an SP1 verifier that scales with the size of the Nova circuit, but it shouldn’t be too bad.
ezkl for zk machine learning. It handles nonlinearities like relu or floating point ops very well, and the devops right now is far better than Modulus Labs’ since you can directly compile Pytorch. I do think Modulus Labs’ strategy of using GKR is probably the long-term fastest, but will require highly custom code and expertise for a while.
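
Since several of these recommendations lean on Nova-style folding, here is a toy Python sketch of what folding two instances into one means, in the style of Nova’s relaxed R1CS. It is a sketch under loud assumptions: the modulus is an arbitrary illustrative prime rather than any production curve’s field, the single-constraint circuit (x * x = y) and the names matvec, hadamard, is_satisfied, and fold are hypothetical, and it omits the commitments and Fiat-Shamir challenge derivation that a real scheme needs, so it is neither secure nor zero knowledge.

```python
import random

P = 2**61 - 1  # illustrative prime modulus, not a real proving-system field

def matvec(M, z):
    return [sum(a * b for a, b in zip(row, z)) % P for row in M]

def hadamard(a, b):
    return [(x * y) % P for x, y in zip(a, b)]

# R1CS matrices for the single constraint x * x = y over z = (u, x, y).
A = [[0, 1, 0]]
B = [[0, 1, 0]]
C = [[0, 0, 1]]

def is_satisfied(z, E):
    """Relaxed R1CS check: (A z) o (B z) == u * (C z) + E, where u = z[0]."""
    u = z[0]
    lhs = hadamard(matvec(A, z), matvec(B, z))
    rhs = [(u * c + e) % P for c, e in zip(matvec(C, z), E)]
    return lhs == rhs

def fold(z1, E1, z2, E2, r):
    """Fold two relaxed instances into one using challenge r."""
    u1, u2 = z1[0], z2[0]
    Az1, Bz1, Cz1 = matvec(A, z1), matvec(B, z1), matvec(C, z1)
    Az2, Bz2, Cz2 = matvec(A, z2), matvec(B, z2), matvec(C, z2)
    # Cross term that absorbs the mixed products created by the combination.
    T = [
        (a1 * b2 + a2 * b1 - u1 * c2 - u2 * c1) % P
        for a1, b1, c1, a2, b2, c2 in zip(Az1, Bz1, Cz1, Az2, Bz2, Cz2)
    ]
    z = [(a + r * b) % P for a, b in zip(z1, z2)]
    E = [(e1 + r * t + r * r * e2) % P for e1, t, e2 in zip(E1, T, E2)]
    return z, E

# Two ordinary satisfying instances (u = 1, E = 0): 3*3 = 9 and 5*5 = 25.
z1, z2 = [1, 3, 9], [1, 5, 25]
r = random.randrange(1, P)  # a real scheme derives r from commitments
z, E = fold(z1, [0], z2, [0], r)
print(is_satisfied(z, E))
```

Checking the one folded instance stands in for checking both originals: if either input fails its own relaxed check, the folded instance fails too, except for an unlucky choice of r.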

I try to drink my own Kool-Aid, so we have used this logic to prioritize what to put into production for zk email, which is circom/hypernova for client side proofs, circom on the server side, teams working on noir, and sp1 incoming. In practice, we haven’t really regretted it yet, and expect that within the next year we will have to rewrite our circuits in whatever the newest fastest proving language is.

More bespoke systems may be marginally faster, but harder to learn and you may have to build more core circuit logic yourself. There are many, many other proof systems that I haven’t mentioned nor gotten time to look at (Lurk, Kimchi/Pickles, Boojum, etc) – until they are more mainstream, many of the core primitives (big int, signatures, hashing algorithms, etc) are likely still being built out, so they are not high priority for us to explore. In addition, they have not been audited as rigorously yet, so security of the implementation is often unproven (i.e. plonky2/starky keccak is still underconstrained). I am open to seeing fair, comparative benchmarks for new proof systems on common operations however, and adding to this list!
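
For readers new to the term, here is a toy Python illustration of what “underconstrained” means. It is unrelated to the actual plonky2/starky keccak issue: the modulus is an arbitrary illustrative prime and the function names are hypothetical. The circuit is meant to compute out = a AND b on bits, but the first version forgets to constrain a and b to be bits, so a malicious prover can satisfy it with field elements that are not bits at all.

```python
P = 2**61 - 1  # illustrative prime modulus

def check_underconstrained(a, b, out):
    # Only enforces out = a * b; nothing forces a and b to be 0 or 1.
    return (a * b - out) % P == 0

def check_fixed(a, b, out):
    # Adds the booleanity constraints a*(a-1) = 0 and b*(b-1) = 0.
    booleanity = (a * (a - 1)) % P == 0 and (b * (b - 1)) % P == 0
    return booleanity and (a * b - out) % P == 0

inv2 = pow(2, P - 2, P)  # multiplicative inverse of 2 mod P (Fermat)

# A forged witness: a = 2 and b = 1/2 are not bits, yet a * b = 1.
print(check_underconstrained(2, inv2, 1))  # accepted by the buggy circuit
print(check_fixed(2, inv2, 1))             # rejected once bits are enforced
print(check_fixed(1, 1, 1))                # honest witness still passes
```

An honest prover never notices the missing constraints, which is why such bugs survive testing and tend to be found only by audits.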

ZK Ecosystems and Products

This landscape has a great breakdown of what all of the different provers, networks, L1s, and ZKEVMs are doing. It does a great job breaking down the technical stacks, so I’ve opted to focus on open source ethos here instead.

Ethereum: constantly has a high ethos of working in public, novel ideas, critical public-goods aligned ideological drive, and consistently intellectually honest technical direction. It thus attracts an extremely high quality of publicly discussed open source protocol and cryptography research, and basically every other chain is playing catch up. Cost per proof will plummet over time as L2s and ZKEVMs reach production. I count all EVM-compatible chains in this as well, including Optimism, Arbitrum, ZK rollups, etc.
SP1/Succinct: A generic ZKVM that directly compiles Rust into plonky3, making it super fast to develop in. Is about 10-40x faster than Risc0 on CPU, although Risc0 has invested a lot into GPU optimization. Is fully open source and MIT licensed; SP1 is the first direct Rust to zk proof system with performant proofs on consumer hardware. The cost of STARKs is of course proof size, but given Succinct’s experience with recursive groth16 verifiers, I expect there to be on-chain verification in 2024.
Risc0: Claims to have MIT licensing, but seeing some of their non-obfuscated code requires signing NDAs – this means you have to trust their audits for security, not the code itself. Uses a custom implementation of STARKs (FRI) to prove the RISC-V instruction set directly from Rust. STARKs inherently have larger proof sizes (~250 kB, see their datasheet) – they are quick to generate (seconds on consumer hardware, less on their cloud cluster), and verification on-chain requires their recursive groth16 prover which can run on a server and doesn’t need to be client-side. As this becomes more efficient i.e. via better lookup algorithms, I hope it becomes the de-facto way that servers prove they are executing the correct open source code. They have a pretty insane 100 GPU cluster for fast, non-private proving. It would actually be pretty cool to publicly verify all Rust server executions via this protocol, and their new thing Bonsai with infra for on-chain calls of arbitrary programs is a pretty interesting model. In general, because of the overhead from not having bespoke circuits, it will be much slower than rolling your own circuits and will be even slower in browser (for the same aforementioned reasons as small field FRI), but we are excited to see how it evolves.
Starkware/Cairo: They were the first in the STARK space and made insane tech breakthroughs and speed benchmarks years before others. I hear Cairo is easy to learn because it is Rust syntax inspired. Unfortunately, all the good teams I knew building on it (i.e. Modulus Labs) left due to poor tooling and poor scaling, and I know other teams for which dealing with the verifier is a bottleneck to their shipping velocity. They claim transpilation can make it compatible with Solidity, but making this process auditable seems near impossible so it is hard to verify full EVM-compatibility (not their fault, mostly inherent with the decision to use a non-Solidity compatible language). Regardless, prior to Sept 2023, I had not recommended using Starkware because of vendor lock-in and since Cairo code had non-commercial restrictions and weird licensing with Polaris, but this choice is worth reconsidering now that the code is open source Apache.
Sismo: Unfortunately, Sismo is now dead. I’m glad a consumer company took stabs at adding zk into user facing apps and auth. I’d love to pick their brains on why they stopped the project – my guess is that they never identified a real product usecase, and drank the ‘web3 identity’ koolaid too hard when the only users were sybil farmers. Their design also had an unfortunate centralized choice – the core of the protocol that actually verifies the claims is Sismo’s commitment mapper, which is a trusted offchain service that Sismo runs that verifies all the data. This means that Sismo themselves can claim anything they want; since proofs of membership are anonymous, no one has any idea whether or not Sismo has exercised this power. While this assumption may be ok for users in the short term, we expect that in the long term, fully decentralized alternatives will prove more lindy and reliable. To give them credit, their docs were transparent about this assumption.
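
A rough back-of-the-envelope calculation shows why the STARK-based stacks above wrap their proofs in a recursive groth16 proof before going on-chain. The Python sketch below assumes Ethereum’s EIP-2028 calldata price of 16 gas per nonzero byte and, pessimistically, that every proof byte is nonzero; the function name is hypothetical, and the number covers calldata only, before any verification work.

```python
GAS_PER_NONZERO_CALLDATA_BYTE = 16  # EIP-2028 price; assumed to apply here

def calldata_gas(num_bytes: int) -> int:
    # Upper bound: treats every byte as nonzero (zero bytes are cheaper).
    return num_bytes * GAS_PER_NONZERO_CALLDATA_BYTE

stark_proof_bytes = 250_000   # the ~250 kB STARK proof size quoted above
groth16_proof_bytes = 8 * 32  # 8 uint256s, as quoted for circom groth16

print(calldata_gas(stark_proof_bytes))    # 4000000
print(calldata_gas(groth16_proof_bytes))  # 4096
```

Under these assumptions, just posting the raw STARK proof costs about 4M gas, roughly a thousand times the calldata cost of a groth16 proof.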

I think it’s clear that on-chain apps should be built in the Ethereum ecosystem for the time being – we expect more things to be EVM compatible over time so this will be a good bet regardless. Other nascent zk ecosystems include Aztec, Mina, and Miden, but they inherently have fewer developers and less ecosystem activity [and for the most part, have yet to adopt permissive OSS licenses for their end to end stack]. I am excited to see results, benchmarks, and progress from these teams, as they all have commendable and ambitious goals.

Places to Learn

I usually recommend our lecture series at zkiap.com to start writing your first ZK circuits in circom.

For transparency, I have asked for and accepted grants from 0xPARC and the Ethereum Foundation PSE for my work. I sought them out and not the other way around, so I don’t think it explicitly sways my thinking. However, it does give me an availability bias to the tech that I see people near me working on. They are also optimizing for speed in an intellectually honest way, so I think it doesn’t sway my opinions, but I acknowledge that it makes me less privy to details of other ecosystems.

If you are excited by zk and want support to help get oriented in the space, I am happy to answer any questions over Telegram. If you’re looking for ideas, we have open sourced our best ideas for zk email and best ideas for novel projects in crypto generally, and will support any open source developer who wants to build them or any other ambitious idea.