a16z: A Phased Approach to Secure and Efficient zkVM Implementation (A Must-Read for Developers)

By: blockbeats|2025/03/12 20:15:02
Original Title: The path to secure and efficient zkVMs: How to track progress
Original Author: a16z crypto
Original Translation: Golem, Odaily Planet Daily Golem

zkVMs (zero-knowledge virtual machines) promise to "make SNARKs mainstream," allowing anyone (even without specialized SNARK expertise) to prove that they have correctly run any program on a given input (or witness). Their core advantage lies in developer experience, but they currently face significant challenges in security and performance. To fulfill the zkVM vision, designers must overcome these challenges. In this article, I outline the likely stages of zkVM development, which will take several years to complete.

Challenges

In terms of security, zkVMs are highly complex software projects still riddled with vulnerabilities. In terms of performance, proving that a program ran correctly can be orders of magnitude slower than running it natively, which makes most applications impractical to deploy in the real world.

Despite these real-world challenges, most companies in the blockchain industry portray zkVMs as ready for immediate deployment. In fact, some projects already pay significant computational costs to generate proofs of on-chain activity. However, because zkVMs are still imperfect, this is merely a costly way of pretending that a system is protected by a SNARK when, in reality, it is either protected by permissioning or, worse, vulnerable to attack.

We are still several years away from a secure and high-performance zkVM. This article proposes a series of staged, concrete goals for tracking zkVM progress: goals that cut through the hype and help the community focus on real advancement.

Security Phases

SNARK-based zkVMs typically include two main components:

· Polynomial Interactive Oracle Proof (PIOP): Used to prove statements about polynomials (or constraints derived from them) in an interactive proof framework.

· Polynomial Commitment Scheme (PCS): Ensures that the prover cannot lie about polynomial evaluations without being caught.

In essence, a zkVM encodes valid execution traces as a constraint system (broadly, constraints that force the virtual machine to use registers and memory correctly) and then applies a SNARK to prove that these constraints are satisfied.

The only way to ensure that a system as complex as a zkVM is free of bugs is formal verification. Below is a breakdown of the security phases. Phase 1 concerns a correct protocol, while Phase 2 and Phase 3 concern a correct implementation.

Security Phase 1: Correct Protocol

1. A formally verified proof that the PIOP is sound;

2. A formally verified proof that the PCS is binding under certain cryptographic assumptions or idealized models;

3. If Fiat-Shamir is used, a formally verified proof that the succinct argument obtained by combining the PIOP and the PCS is secure in the random oracle model (augmented with other cryptographic assumptions as needed);

4. Formal verification proof that the constraint system employed by PIOP is equivalent to the VM's semantics;

5. Combining all of the above into a single, formally verified proof that the SNARK is secure for running any program specified by the VM bytecode. If the protocol aims for zero-knowledge, that property must also be formally verified to ensure that no sensitive information about the witness is leaked.

Recursion caveat: If the zkVM uses recursion, every PIOP, commitment scheme, and constraint system involved anywhere in the recursion must be verified before this phase can be considered complete.

Security Phase 2: Correct Verifier Implementation

Formally verify that the actual implementation of the zkVM verifier (in Rust, Solidity, etc.) matches the protocol verified in Phase 1. Achieving this ensures that the implemented protocol is sound (rather than just a design on paper or an inefficient specification written in, for example, Lean).

There are two reasons why Phase 2 concerns only the verifier implementation (not the prover). First, a correct verifier is sufficient to guarantee soundness (i.e., to ensure that the verifier cannot be convinced that a false statement is true). Second, a zkVM's verifier implementation is more than an order of magnitude simpler than its prover implementation.

Security Phase 3: Correct Prover Implementation

Formally verify that the actual implementation of the zkVM prover correctly generates proofs for the proof system verified in Phase 1 and Phase 2. This ensures completeness, meaning that no system using the zkVM can get "stuck" with a statement it cannot prove. If the prover is intended to be zero-knowledge, this property must also be formally verified.
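The three security phases can be read as a cumulative checklist. The following is a minimal sketch of such a tracker, assuming that a phase only counts once every earlier phase is also complete; the class and field names are hypothetical and are not taken from any existing tool.

```python
from dataclasses import dataclass


@dataclass
class SecurityStatus:
    """Hypothetical record of what has been formally verified for one zkVM."""
    protocol_verified: bool        # Phase 1: PIOP, PCS, Fiat-Shamir, constraints (incl. recursion)
    verifier_impl_verified: bool   # Phase 2: verifier code matches the verified protocol
    prover_impl_verified: bool     # Phase 3: prover code produces valid proofs (completeness)


def security_phase(status: SecurityStatus) -> int:
    """Return the highest security phase reached (0 means none).

    Assumption: phases are cumulative, so a verified prover does not count
    while the protocol or the verifier implementation is still unverified.
    """
    if not status.protocol_verified:
        return 0
    if not status.verifier_impl_verified:
        return 1
    if not status.prover_impl_verified:
        return 2
    return 3


if __name__ == "__main__":
    # Verified protocol and verifier, unverified prover: Phase 2.
    print(security_phase(SecurityStatus(True, True, False)))
```

Treating the phases as cumulative is a design choice in this sketch: work on later phases can start early, but it is only credited once the earlier phases hold.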

Expected Timeline

· Phase 1 Progress: We can expect incremental progress next year (e.g., ZKLib). However, it will take at least two years before any zkVM can fully meet the requirements of Phase 1;

· Phase 2 and Phase 3: These phases can progress alongside some aspects of Phase 1. For example, some teams have already demonstrated that the Plonk Prover's implementation matches the protocol in the paper (although the protocol in the paper itself may not be fully verified). Nevertheless, I expect that no zkVM will reach Phase 3 in less than four years, and it may take longer.

Key Points: Fiat-Shamir Security and Verified Bytecode

A major complicating factor is the set of unresolved research questions surrounding the security of the Fiat-Shamir transformation. All three phases treat Fiat-Shamir and the random oracle as beyond reproach, but in reality the entire paradigm may have holes, because the idealized random oracle differs from the hash functions used in practice. In the worst case, a system that has reached Phase 2 could later be found to be entirely insecure due to Fiat-Shamir issues. This is a cause for serious concern and is the subject of ongoing research. We might need to modify the transformation itself to better guard against such vulnerabilities.

Non-recursive systems are theoretically more robust because certain known attacks involve circuits similar to those used in recursive proofs.

Another point to note is that if the bytecode itself is flawed, then even if the proof of correct execution of a computer program (specified through bytecode) has run successfully, its value is limited. Therefore, the practicality of zkVM largely depends on the method of generating formally verified bytecode—a significant challenge beyond the scope of this article.

Regarding Post-Quantum Security

For at least the next five years (possibly longer), quantum computing will not pose a serious threat, whereas bugs are an existential risk. The primary focus now should therefore be on meeting the security and performance phases discussed in this article. If we can meet these security requirements sooner with SNARKs that are not quantum-secure, we should do so until post-quantum SNARKs catch up or until serious concerns arise about cryptographically relevant quantum computers.

zkVM Performance Status

Currently, the zkVM prover's overhead factor is close to 1,000,000 times the native execution cost. If a program takes X cycles to run, the cost to prove correct execution is approximately X multiplied by 1,000,000 CPU cycles. This was the case a year ago, and remains so today.
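The overhead estimate above is a single multiplication. The sketch below works it through for a hypothetical workload; the 3 GHz clock and the 3-million-cycle program are illustrative assumptions, not measurements.

```python
# Overhead factor quoted in the text: roughly 1,000,000x native execution.
PROVER_OVERHEAD = 1_000_000


def proving_cycles(native_cycles: int, overhead: int = PROVER_OVERHEAD) -> int:
    """Estimate CPU cycles needed to prove a program that runs in `native_cycles`."""
    return native_cycles * overhead


def proving_seconds(native_cycles: int, clock_hz: float,
                    overhead: int = PROVER_OVERHEAD) -> float:
    """Convert the estimate to single-threaded wall-clock seconds at `clock_hz`."""
    return proving_cycles(native_cycles, overhead) / clock_hz


if __name__ == "__main__":
    # Hypothetical program: 3 million native cycles, i.e. about 1 ms at an assumed 3 GHz.
    print(proving_cycles(3_000_000))        # 3000000000000 cycles
    print(proving_seconds(3_000_000, 3e9))  # 1000.0 seconds
```

Under these assumptions, a program that takes about a millisecond to run natively takes on the order of a quarter of an hour to prove on a single thread.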

Popular narratives typically describe this overhead in a way that sounds acceptable. For example:

· "The cost to generate a proof for all Ethereum mainnet transactions in a year is less than one million dollars."

· "We can almost generate Ethereum block proofs in real time using a cluster of tens of GPUs."

· "Our latest zkVM is 1,000 times faster than its predecessor."

While technically accurate, these statements can be misleading without proper context. For example:

· Being 1,000 times faster than an older zkVM still leaves the absolute speed very slow. This says more about how bad things were than about how good they are now.

· There are proposals to increase the computational load on the Ethereum mainnet by a factor of 10. That would leave current zkVM performance even further from adequate.

· What is called "near real-time proving of Ethereum blocks" is still much slower than many blockchain applications require (for example, Optimism has a block time of 2 seconds, much faster than Ethereum's 12-second block time).

· "A cluster of tens of GPUs that always runs flawlessly" does not provide an acceptable liveness guarantee.

· That proving all activity on the Ethereum mainnet costs less than one million dollars per year mainly reflects how little computation is involved: an Ethereum full node performs only about $25 worth of computation per year.

For applications outside of blockchain, such overhead is clearly too high. No amount of parallelization or engineering can offset such enormous overhead. We should take as a basic benchmark that zkVM's slowdown compared to native execution does not exceed 100,000 times—even if this is just the first step. True mainstream adoption may require overhead closer to 10,000 times or lower.

Measuring Performance

SNARK performance has three main components:

· Underlying proof system's intrinsic efficiency.

· Application-specific optimizations (e.g., precompiles).

· Engineering and hardware acceleration (e.g., GPU, FPGA, or multi-core CPU).

While the latter two are crucial for real-world deployment, they generally apply to any proof system, so they do not necessarily reflect the underlying overhead. For example, adding GPU acceleration and precompiles to a zkEVM can easily yield a 50x speedup over a purely CPU-based approach without precompiles, which is enough to make an inherently less efficient system appear superior to one that has not been similarly polished.
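To see how polish can invert a ranking, consider two made-up systems. The sketch below assumes that GPU acceleration and precompiles act as one multiplicative speedup on the intrinsic overhead; the overhead figures for both systems are hypothetical, and only the 50x factor comes from the text.

```python
def headline_overhead(intrinsic_overhead: float, polish_speedup: float = 1.0) -> float:
    """Overhead reported after engineering polish (GPU, precompiles) is applied.

    Assumption: polish acts as a single multiplicative speedup.
    """
    return intrinsic_overhead / polish_speedup


# Hypothetical system A: intrinsically 10x worse, but polished with a 50x speedup.
system_a = headline_overhead(1_000_000, polish_speedup=50)
# Hypothetical system B: better proof system, benchmarked on CPU without precompiles.
system_b = headline_overhead(100_000)

if __name__ == "__main__":
    print(system_a)             # 20000.0
    print(system_b)             # 100000.0
    print(system_a < system_b)  # True: A looks better despite the weaker core
```

If the same polish were applied to system B, it would come out ahead, which is why the comparison is only meaningful once the three factors are separated.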

Therefore, the focus below is on SNARK performance without specialized hardware and without precompiles. This differs from current benchmarking practice, which often lumps all three factors into a single "headline number." That is akin to judging a diamond by how long it was polished rather than by its inherent clarity. The aim is to isolate the intrinsic overhead of general-purpose proof systems, helping the community remove confounding variables and focus on true progress in proof system design.

Performance Phases

Here are five performance milestones. First, we need to reduce the prover's overhead on a CPU by several orders of magnitude. Only then should the focus shift to further reductions through hardware. Memory usage also needs to improve.

Across all stages below, developers should not have to implement custom code specific to zkVM to achieve the necessary performance. Developer experience is a key advantage of zkVM. Sacrificing DevEx to meet performance benchmarks would contradict the essence of zkVM itself.

These metrics focus on the prover's cost. However, if unlimited verifier cost is allowed (i.e., no bounds on proof size or verification time), any prover metric can be easily achieved. Therefore, for systems to adhere to the stages described, maximum values for proof size and verification time must be specified.

Performance Requirements

Phase 1 Requirement: "Reasonable and Nontrivial Verification Cost":

· Proof Size: The proof size must be smaller than the witness size.

· Verification Time: The speed of verifying the proof must not be slower than running the program natively (i.e., performing the computation without a correctness proof).

These are minimal succinctness requirements. They ensure that proof size and verification time are no worse than simply sending the witness to the verifier and having the verifier check its correctness directly.

Requirements for Phase 2 and Beyond:

· Maximum Proof Size: 256 KB.

· Maximum Verification Time: 16 milliseconds.

These cutoff values are intentionally set high to accommodate new fast proof technologies that may bring higher verification costs. At the same time, they exclude very expensive proofs that few projects would be willing to include on the blockchain.
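The verifier-cost requirements above reduce to a handful of comparisons. The following is a minimal sketch, assuming 1 KB = 1024 bytes and that sizes and times have been measured elsewhere; the type and function names are hypothetical.

```python
from dataclasses import dataclass

MAX_PROOF_BYTES = 256 * 1024   # 256 KB cap (assuming 1 KB = 1024 bytes)
MAX_VERIFY_SECONDS = 0.016     # 16 ms cap


@dataclass
class VerifierCost:
    """Hypothetical measurements for one proof of one program run."""
    proof_bytes: int
    witness_bytes: int
    verify_seconds: float
    native_seconds: float   # time to run the program natively, without a proof


def meets_phase1_verifier_cost(c: VerifierCost) -> bool:
    """Phase 1: proof smaller than the witness, verification no slower than native."""
    return c.proof_bytes < c.witness_bytes and c.verify_seconds <= c.native_seconds


def meets_phase2_verifier_cost(c: VerifierCost) -> bool:
    """Phase 2 and beyond: fixed caps on proof size and verification time."""
    return c.proof_bytes <= MAX_PROOF_BYTES and c.verify_seconds <= MAX_VERIFY_SECONDS


if __name__ == "__main__":
    sample = VerifierCost(proof_bytes=200_000, witness_bytes=50_000_000,
                          verify_seconds=0.010, native_seconds=2.0)
    print(meets_phase1_verifier_cost(sample), meets_phase2_verifier_cost(sample))
```

The two checks are kept separate because the text states them separately: the Phase 1 bound is relative to the witness and the native run, while the later bound is absolute.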

Speed Phase 1

A single-thread proof must be at most 100,000 times slower than native execution, measured across a range of applications (not just proofs of Ethereum blocks) and not relying on precompiles.

Specifically, think of a RISC-V processor running at about 3 billion cycles per second on a modern laptop. Achieving Phase 1 means you can prove approximately 30,000 RISC-V cycles per second on the same laptop (single-threaded). The verification cost must still be "reasonable and nontrivial," as described above.
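The Phase 1 target follows from dividing the native clock rate by the allowed overhead. The sketch below assumes a native rate of about 3 GHz, which is the value for which a 100,000x bound gives the 30,000 cycles per second quoted above; the function names are hypothetical.

```python
def required_proving_rate(native_hz: float, max_overhead: float) -> float:
    """Minimum RISC-V cycles proved per second to stay within `max_overhead`."""
    return native_hz / max_overhead


def measured_overhead(native_hz: float, proved_cycles_per_second: float) -> float:
    """Slowdown of a prover relative to native execution."""
    return native_hz / proved_cycles_per_second


if __name__ == "__main__":
    NATIVE_HZ = 3e9  # assumed laptop clock rate, about 3 billion cycles per second
    print(required_proving_rate(NATIVE_HZ, 100_000))  # 30000.0 (Speed Phase 1)
    print(required_proving_rate(NATIVE_HZ, 10_000))   # 300000.0 (Speed Phase 2)
    print(measured_overhead(NATIVE_HZ, 3_000))        # 1000000.0 (the current status quo)
```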

Speed Phase 2

A single-thread proof must be at most 10,000 times slower than native execution.

Alternatively, because some promising SNARK techniques (especially those based on binary fields) are held back by current CPUs and GPUs, you can qualify for this phase using FPGAs (or even ASICs) by comparing:

· The number of RISC-V cores that an FPGA can simulate at native speed;

· The number of FPGAs required to simulate and prove RISC-V execution in (near) real time.

If the latter is at most 10,000 times the former, you qualify for Phase 2. On a standard CPU, the proof size must still be at most 256 KB, and the verifier time at most 16 milliseconds.
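The FPGA route is a ratio test between two counts. The sketch below follows a literal reading of the criterion (the second count may be at most 10,000 times the first); the function name and the sample numbers are hypothetical.

```python
def qualifies_speed_phase2_fpga(cores_simulated_per_fpga: float,
                                fpgas_needed_to_prove: float,
                                max_ratio: float = 10_000) -> bool:
    """Literal reading of the FPGA criterion for Speed Phase 2.

    cores_simulated_per_fpga: RISC-V cores one FPGA can simulate at native speed.
    fpgas_needed_to_prove: FPGAs needed to simulate and prove RISC-V execution
        in (near) real time.
    """
    return fpgas_needed_to_prove <= max_ratio * cores_simulated_per_fpga


if __name__ == "__main__":
    # Hypothetical: one FPGA simulates 4 cores; proving in real time needs 30,000 FPGAs.
    print(qualifies_speed_phase2_fpga(4, 30_000))   # True  (30,000 <= 40,000)
    print(qualifies_speed_phase2_fpga(4, 50_000))   # False (50,000 >  40,000)
```

This check covers only the prover-side ratio; the proof size and verification time caps still have to be checked separately on a standard CPU.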

Speed Phase 3

In addition to achieving Speed Phase 2, achieve a prover overhead of less than 1,000 times (across a wide range of applications) using precompiles that are automatically synthesized and formally verified. Essentially, the instruction set is customized dynamically for each program to accelerate proving, but this must be done in an easy-to-use and formally verified manner.
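The three speed phases can be summarized as thresholds on measured prover overhead. The following is a minimal sketch, assuming single-threaded CPU measurements across a broad set of applications; it ignores the FPGA route and the verifier-cost caps, and the function and parameter names are hypothetical.

```python
from typing import Optional


def speed_phase(overhead_without_precompiles: float,
                overhead_with_verified_precompiles: Optional[float] = None) -> int:
    """Return the highest speed phase reached (0 means none).

    overhead_without_precompiles: prover slowdown versus native, no precompiles.
    overhead_with_verified_precompiles: slowdown when using automatically
        synthesized, formally verified precompiles, or None if unavailable.
    """
    if overhead_without_precompiles > 100_000:
        return 0
    if overhead_without_precompiles > 10_000:
        return 1
    if (overhead_with_verified_precompiles is None
            or overhead_with_verified_precompiles >= 1_000):
        return 2
    return 3


if __name__ == "__main__":
    print(speed_phase(1_000_000))   # 0: the status quo described earlier
    print(speed_phase(50_000))      # 1
    print(speed_phase(8_000))       # 2
    print(speed_phase(8_000, 500))  # 3
```

Hand-written or unverified precompiles deliberately have no input here, since the text does not count them toward any phase.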

Memory Phase 1

Speed Phase 1 is achieved with the prover requiring less than 2 GB of memory (while also achieving zero-knowledge).

This is crucial for many mobile devices or browsers, opening up countless client-side zkVM use cases. Client-side proofs are important because our phones are our ongoing contact with the real world: they track our location, credentials, etc. If generating a proof requires more than 1-2 GB of memory, that's too much for most of today's mobile devices. Two points need to be clarified:

· The 2 GB space limit applies to large statements (statements that require trillions of CPU cycles to run natively). Proof systems that meet the space limit only for small statements lack broad applicability.

· If a prover is very slow, it is easy to keep its memory footprint below 2 GB. Therefore, to make Memory Phase 1 non-trivial, I require Speed Phase 1 to be met within this 2 GB space limit.

Memory Phase 2

Speed Phase 1 is achieved with a memory footprint of less than 200 MB (10 times better than Memory Phase 1).

Why push it below 2 GB? Consider a non-blockchain example: every time you visit a website over HTTPS, you download a certificate for identification and encryption. Instead, the website could send zk proofs of possessing these certificates. A large website could issue millions of such proofs per second. If each proof requires 2 GB of memory to generate, that would require petabytes of RAM in total. Further reducing memory usage is crucial for non-blockchain deployments.
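The petabyte figure can be checked with a back-of-the-envelope calculation. The sketch below assumes one million proofs per second and that each proof occupies its memory for one second, so that about one million proofs are in flight at once; both numbers are illustrative assumptions.

```python
def aggregate_prover_memory_bytes(proofs_per_second: float,
                                  seconds_per_proof: float,
                                  bytes_per_proof: float) -> float:
    """Total RAM held by all proofs in flight.

    Proofs in flight = arrival rate x time each proof takes (Little's law).
    """
    proofs_in_flight = proofs_per_second * seconds_per_proof
    return proofs_in_flight * bytes_per_proof


PETABYTE = 10**15

if __name__ == "__main__":
    # Assumed load: 1 million proofs per second, 1 second of proving time each.
    at_2_gb = aggregate_prover_memory_bytes(1_000_000, 1.0, 2 * 10**9)
    at_200_mb = aggregate_prover_memory_bytes(1_000_000, 1.0, 200 * 10**6)
    print(at_2_gb / PETABYTE)    # 2.0 PB at the Memory Phase 1 limit
    print(at_200_mb / PETABYTE)  # 0.2 PB at the Memory Phase 2 limit
```

A slower prover holds its memory for longer, so the total scales with proving time as well as with the per-proof footprint.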

Precompiles: The Last Mile or a Crutch?

In zkVM design, precompiles are specialized SNARKs (or constraint systems) tailored to specific functions, such as Keccak/SHA hashing or the elliptic curve group operations used in digital signatures. In Ethereum (where much of the heavy lifting involves Merkle hashing and signature checking), a few hand-crafted precompiles can reduce the prover's costs. However, relying on them as a crutch will not get SNARKs where they need to go. Here's why:

· Still too slow for most applications (both inside and outside blockchains): Even with hash and signature precompiles, current zkVMs remain too slow because the core proof system is inefficient.

· Security Failures: Handwritten precompiles not formally verified are almost certainly riddled with bugs, potentially leading to catastrophic security failures.

· Poor Developer Experience: In most zkVMs today, adding a new precompile means hand-writing a constraint system for each function, essentially reverting to a 1960s-style workflow. Even with existing precompiles, developers must refactor their code to invoke each one. We should optimize for security and developer experience rather than sacrifice both in pursuit of incremental performance gains. Making that trade only shows that performance is not yet where it needs to be.

· I/O Overhead and Lack of RAM: While precompiles can improve performance for heavy cryptographic tasks, they may not provide meaningful acceleration for more diverse workloads as they incur significant overhead when handling input/output and cannot use RAM. Even in a blockchain context, as soon as you move beyond a single L1 like Ethereum (e.g., if you want to build a series of cross-chain bridges), you encounter different hash functions and signature schemes. Redoing precompiles over and over for the same problem is not scalable and poses significant security risks.

For all these reasons, our primary task should be to enhance the efficiency of the underlying zkVM. The technology that produces the best zkVM will also produce the best precompiles. I do believe precompiles will remain crucial in the long run, but only if they are auto-synthesized and formally verified. This way, we can maintain the developer experience advantage of zkVM while avoiding disastrous security risks. This viewpoint is reflected in Speed Phase 3.

Expected Timeline

I expect a few zkVMs to achieve Speed Phase 1 and Memory Phase 1 later this year. I also anticipate Speed Phase 2 to be achieved within the next two years, although it is currently unclear if we can reach this goal without some yet-to-emerge new ideas. I predict the remaining phases (Speed Phase 3 and Memory Phase 2) will take a few more years to accomplish.

Conclusion

While I have delineated the security and performance stages of zkVM separately in this article, these aspects of zkVM are not entirely independent. As more vulnerabilities are found in zkVM, it is expected that some vulnerabilities can only be fixed at the cost of a significant performance hit. Performance should be deferred until zkVM reaches Security Phase 2.

zkVM promises to truly democratize zero-knowledge proofs but is still in its infancy — filled with security challenges and significant performance overhead. Hype and marketing make it difficult to assess true progress. By outlining clear security and performance milestones, this roadmap aims to provide a distraction-free path forward. We will achieve the goals, but it will take time and sustained effort.
