Supra vs. Monad: Towards the Best Parallel Execution of the Ethereum Virtual Machine
September 27, 2025
Blockchain technology has come a long way. We’ve seen major improvements in how data moves through networks, how consensus is reached, and how data is stored. But there’s still a critical bottleneck: transaction execution efficiency. Two innovative solutions are tackling this challenge head-on. Supra’s conflict specification-aware Block Transactional Memory (“SupraBTM”) and Monad’s 2-phase optimistic parallel execution (“Monad 2PE”) represent fundamentally different philosophies for accelerating EVM transaction processing. While both aim to maximize execution efficiency, they take distinctly different paths when it comes to handling conflicts, scheduling transactions, and optimizing overall system performance.
In this deep dive, we’ll unpack both approaches, examining their unique strengths and the trade-offs that come with each design choice. Empirical results show that SupraBTM can achieve a ~4× speedup over traditional sequential execution. Moreover, SupraBTM has a ~1.5-1.7× speedup over Monad’s approach.
In simple terms, a 1.5-1.7× speedup means Supra executes the same workload in roughly 33-41% less time than Monad. To our knowledge, this is the highest-performing EVM execution in the world!
Conflict Specification-Aware SupraBTM enhances traditional Software Transactional Memory (STM) techniques for the EVM (aka “PEVM” by the Rise team) by incorporating Conflict Specification, our novel mechanism that derives conflict specifications and leverages them for efficient execution. Unlike Monad, SupraBTM pre-determines read/write sets through static analysis and transaction input parameters, so that conflicts are prevented proactively rather than discovered during execution.
Here’s how it works:
Static Conflict Analysis of Contracts: When a smart contract is deployed, Specification-Aware SupraBTM (iBTM) performs static analysis to derive its conflict specification, from which the read and write sets of every transaction in a block can be extracted safely.
Dependency Graph Construction: Transactions are organized into a Directed Acyclic Graph (DAG) based on their conflict specifications from the conflict analyzer. If two transactions have non-overlapping read/write sets, they are marked as independent and can execute in parallel.
Optimized Scheduling: SupraBTM (iBTM) schedules transaction execution based on the DAG, ensuring maximum parallelism while avoiding conflicts upfront.
Efficient Execution: With early conflict detection, execution proceeds with minimal aborts and re-execution, reducing computational overhead.
Conflict-aware Adaptive Execution: The conflict specification lets us determine how many pairwise conflicts exist in a block of transactions. We leverage this for an adaptive implementation that can also handle high-conflict workloads. Specifically, by computing a conflict threshold for each block with minimal overhead, the adaptive approach falls back to sequential execution when a block's conflicts exceed that threshold. Selecting the execution mode adds a small overhead, but this is offset by significant execution-time savings on highly conflicting workloads.
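To make the adaptive fallback concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not SupraBTM's implementation: the `TxSpec` type, the naive pairwise count, and the fixed 0.5 threshold are all hypothetical, since the post does not say how the threshold is actually computed.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class TxSpec:
    """Hypothetical conflict specification: keys a transaction may read or write."""
    tx_id: int
    reads: frozenset
    writes: frozenset


def conflicts(a: TxSpec, b: TxSpec) -> bool:
    # Two transactions conflict if either writes a key the other reads or writes.
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


def conflict_ratio(block) -> float:
    # Fraction of transaction pairs that conflict (naive O(n^2) count).
    pairs = list(combinations(block, 2))
    if not pairs:
        return 0.0
    return sum(conflicts(a, b) for a, b in pairs) / len(pairs)


def choose_mode(block, threshold: float = 0.5) -> str:
    # Assumed policy: fall back to sequential execution above the threshold.
    return "sequential" if conflict_ratio(block) > threshold else "parallel"


# Three transfers between disjoint accounts: no pair conflicts.
independent = [
    TxSpec(0, frozenset({"alice"}), frozenset({"alice", "bob"})),
    TxSpec(1, frozenset({"carol"}), frozenset({"carol", "dave"})),
    TxSpec(2, frozenset({"erin"}), frozenset({"erin", "frank"})),
]
# Three swaps against one shared pool: every pair conflicts.
hot = [TxSpec(i, frozenset({"pool"}), frozenset({"pool"})) for i in range(3)]

print(choose_mode(independent), choose_mode(hot))
```

The pairwise count here is quadratic in the block size, so a production version would need a cheaper estimate to keep the overhead minimal.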
We used the core data structures provided by PEVM version 0.1.0 and implemented our algorithms on top of them. Specifically, we used a hash map to store the conflictSet, which holds both the list of dependent transactions and the indegree of each transaction in the DAG. A custom parallel queue manages and enables efficient parallel execution across multiple threads. Empirical results show that SupraBTM can achieve a ~4× speedup over traditional sequential execution. The key advantage of SupraBTM is its proactive conflict prevention and adaptive execution, which largely avoids costly transaction re-executions.
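The following Python sketch shows one way such a hash map of dependents and indegrees could drive scheduling. It is a simplified, single-threaded illustration: the function names and the "wave" grouping are assumptions made for readability, whereas the real engine is written in Rust and uses a custom parallel queue.

```python
from collections import deque


def build_conflict_set(specs):
    """specs: list of (reads, writes) sets in block order.

    Returns a hash map: tx index -> {"dependents": [...], "indegree": n}.
    Edges always point from an earlier transaction to a later one, so the
    graph is acyclic and block order is preserved between conflicting txs.
    """
    conflict_set = {i: {"dependents": [], "indegree": 0} for i in range(len(specs))}
    for j, (rj, wj) in enumerate(specs):
        for i in range(j):
            ri, wi = specs[i]
            if wi & (rj | wj) or wj & ri:
                conflict_set[i]["dependents"].append(j)
                conflict_set[j]["indegree"] += 1
    return conflict_set


def schedule_waves(conflict_set):
    """Group transactions into waves; all txs in a wave can run in parallel."""
    indegree = {tx: entry["indegree"] for tx, entry in conflict_set.items()}
    ready = deque(tx for tx, d in indegree.items() if d == 0)
    waves = []
    while ready:
        wave = sorted(ready)
        ready.clear()
        waves.append(wave)
        for tx in wave:
            # Finishing tx releases its dependents once their indegree hits zero.
            for dep in conflict_set[tx]["dependents"]:
                indegree[dep] -= 1
                if indegree[dep] == 0:
                    ready.append(dep)
    return waves


specs = [
    ({"a"}, {"a"}),       # tx0
    ({"b"}, {"b"}),       # tx1, independent of tx0
    ({"a"}, {"c"}),       # tx2 reads what tx0 writes
    ({"c", "b"}, {"d"}),  # tx3 reads what tx1 and tx2 write
]
conflict_set = build_conflict_set(specs)
waves = schedule_waves(conflict_set)
print(waves)  # [[0, 1], [2], [3]]
```

A real scheduler would dispatch each transaction to a worker as soon as its indegree reaches zero rather than waiting for a whole wave to finish; the waves only make the dependency structure easy to see.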
Monad proposes an optimistic execution model for EVM, aiming to improve blockchain throughput while preserving existing semantics of EVM. The core principle behind Monad’s approach is optimistic parallel execution, which allows transactions to be executed in parallel under the assumption that they are mostly independent. Here’s how it works:
Phase 1
Optimistic Parallel Execution: Transactions are executed simultaneously in multiple threads without prior knowledge of dependencies in the first phase.
Tracking Read/Write Sets: During this phase, the read set (accounts read) and write set (accounts written or updated) of each transaction are recorded as it accesses persistent storage.
Phase 2
Sequential Conflict Detection and Re-execution: In the second sequential validation phase, if a transaction’s read set overlaps with another transaction’s write set, the system identifies a conflict. The affected transaction is then re-executed sequentially to maintain correctness.
Commit: Once transactions are validated, a finalized state is committed to storage.
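The two phases described above can be sketched in a few lines of Python. This is a toy model written for this comparison, not Monad's code: state is a plain dictionary, transactions are Python callables, and names such as `RecordingView` and `execute_block_2pe` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class RecordingView:
    """Read-through view over a base state that records read and write sets."""

    def __init__(self, base):
        self.base = base
        self.reads = set()
        self.writes = {}

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        self.reads.add(key)
        return self.base.get(key, 0)

    def set(self, key, value):
        self.writes[key] = value


def run(tx, base):
    view = RecordingView(base)
    tx(view)
    return view


def execute_block_2pe(txs, pre_state, workers=4):
    # Phase 1: execute every transaction optimistically against the pre-state.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda tx: run(tx, pre_state), txs))

    # Phase 2: validate in block order; re-execute sequentially on conflict.
    state = dict(pre_state)
    dirty = set()  # keys written by transactions already committed in this block
    reexecuted = []
    for index, (tx, view) in enumerate(zip(txs, results)):
        if view.reads & dirty:
            view = run(tx, state)  # stale read: redo against the latest state
            reexecuted.append(index)
        state.update(view.writes)
        dirty.update(view.writes)
    return state, reexecuted


def transfer(src, dst, amount):
    def tx(view):
        view.set(src, view.get(src) - amount)
        view.set(dst, view.get(dst) + amount)
    return tx


pre = {"alice": 100, "bob": 0, "carol": 50, "dave": 0}
block = [transfer("alice", "bob", 10), transfer("carol", "dave", 5), transfer("bob", "dave", 3)]
final, redo = execute_block_2pe(block, pre)
print(final, redo)
```

In this example only the third transfer is re-executed, because it reads balances that the first two transfers changed. When most transactions in a block touch shared state, nearly every transaction takes that re-execution path, which is the overhead discussed next.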
By leveraging parallel access to the database to fetch the latest state, Monad can achieve a significant throughput improvement over sequential execution, as long as conflicts are relatively rare. However, when conflicts are frequent and transactions are large (e.g., DeFi protocols with frequent shared-state modifications, which make up the majority of Ethereum blocks), the cost of re-execution can erode the gains over sequential execution.
Experimental Analysis: SupraBTM (iBTM) vs Monad’s 2PE
Experimental Setup
We conducted a head-to-head benchmark of SupraBTM and Monad’s 2PE executor on Ethereum workloads. Both executors were deployed and tested inside Docker containers to ensure a consistent environment.
Hardware & Environment
Benchmarks were executed on a dedicated server machine:
Server: f4.metal.medium
CPU: AMD 4564P, 16 cores @ 4.5 GHz
RAM: 192 GB
Storage: 2 × 480 GB NVMe + 2 × 1.9 TB NVMe
Networking: 2 × 10 Gbps NICs
Execution Environment: both executors were run inside Docker to isolate dependencies and ensure reproducibility.
Executor Implementations
SupraBTM (iBTM): Implemented in Rust, built on REVM as the underlying EVM engine.
Monad 2PE: Implemented in C++, leveraging a native C++ EVM implementation. Monad has recently open-sourced this codebase.
Sequential Executor (Baseline): Our in-house sequential executor written in Rust, using the same REVM backend as iBTM.
Workload
We used 10,000 historical Ethereum mainnet blocks.
For each block, we gathered:
Pre-state snapshots.
Full block data.
Raw RLP-encoded block data.
These were parsed into the required formats for each executor.
Execution Flow
Before execution, the pre-state is initialized in persistent storage. During execution, each executor fetches state/storage values directly from this persistent layer.
We measure execution time starting from the point where actual transaction execution begins:
For Monad 2PE, we use the first-phase execution time (made measurable via their built-in timer instrumentation).
For SupraBTM (iBTM) and sequential executors, timing starts at block execution entry.
Notably, in our DevNet, block data is first converted into a REVM-compatible format.
Methodology
Each block is executed exactly once per executor.
No repeated runs or averaging were performed; numbers are based on single executions per block, consistent across all executors.
Analysis: Based on the execution comparison data with 8 threads, SupraBTM (iBTM) consistently shows significant speedup over sequential execution with an expected speedup range of 3-7× for most blocks. Moreover, performance gains increase with block complexity. SupraBTM (iBTM) typically outperforms Monad 2PE and demonstrates a more consistent performance advantage across different block sizes.
Block-Level Extremes: Pre- and Post-Merge
Post-Merge Extremes
Best iBTM vs Sequential: Block 21635407 (260 txs) → iBTM ran 6.43× faster.
Worst iBTM vs Sequential: Block 21631268 (190 txs) → iBTM dropped to 0.58×.
Best Monad 2PE vs Sequential: Block 21632191 (63 txs) → Monad achieved a 4.33× speedup.
Worst Monad 2PE vs Sequential: Block 21634397 (8 txs) → Monad collapsed to 0.05×.
Best iBTM vs Monad 2PE: Block 21635484 (26 txs) → iBTM was 14.66× faster than Monad.
Worst iBTM vs Monad 2PE: Block 21632191 (63 txs) → iBTM ran at only 0.25× Monad’s speed.
Pre-Merge Extremes
Best iBTM vs Sequential: Block 14004745 (342 txs) → iBTM delivered a 6.07× speedup.
Worst iBTM vs Sequential: Block 14004589 (3 txs) → iBTM collapsed to a 0.34× speedup.
Best Monad 2PE vs Sequential: Block 14002522 (431 txs) → Monad peaked at a 7.07× speedup.
Worst iBTM vs Monad 2PE: Block 14004589 (3 txs) → iBTM lagged at only 0.108× compared to Monad.
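To read these ratios as changes in execution time, it helps to recall that a speedup is the baseline time divided by the measured time. The small Python helper below (a hypothetical name, written only for this explanation) converts a ratio into the percentage of time saved or added, using the 6.43× and 0.58× post-merge figures as inputs.

```python
def time_change_percent(speedup: float) -> float:
    """Percent change in execution time implied by a speedup ratio.

    speedup = baseline_time / measured_time, so the measured time is
    baseline_time / speedup. Negative results mean less time than the baseline.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0) * 100.0


best = time_change_percent(6.43)   # about -84.4: roughly 84% less time
worst = time_change_percent(0.58)  # about +72.4: roughly 72% more time
print(f"{best:.1f}% {worst:.1f}%")
```

Any ratio below 1× therefore means the parallel executor was slower than the baseline on that block.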
Monad claims to achieve significant improvements from persistent storage optimizations. However, given support for asynchronous concurrent access to state in persistent storage, there is no inherent difference between the two approaches in storage access performance. Both rely on parallel execution techniques that let multiple transactions access storage efficiently, so any claimed advantage in this area would likely apply equally to both systems.
Comparative Analysis
Implementation Differences: Monad’s EVM chain is built in C++ and leverages Just-In-Time compilation for faster contract execution. Supra’s EVM chain is built on top of the Rust-based REVM, and hence the parallel execution engine (SupraBTM) is also written in Rust.
Throughput and Latency: While both approaches aim to enhance throughput, SupraBTM’s use of access specifications allows for more efficient scheduling, resulting in higher throughput and lower latency, especially in environments with frequent transaction conflicts.
Complexity and Implementation: We remark that validating a transaction’s read set is necessary for any optimistic approach. The only way to avoid a validation cost that is linear in the size of the transaction’s read set is to construct the conflict dependency set before determining whether a transaction’s state has been invalidated. Monad’s approach requires mechanisms to track read and write sets dynamically and to handle read set invalidations, hence the need for re-executions, which can introduce runtime overhead and additional latency. SupraBTM, by contrast, relies on conflict analysis to derive access specifications. This potentially adds one-time complexity at smart contract deployment, but it simplifies runtime execution thereafter through novel mechanisms that minimize transaction validation complexity.
While both models aim to optimize parallel execution, Monad’s approach is more flexible but susceptible to re-execution overhead, whereas SupraBTM is more deterministic with lower runtime conflict costs and thus scales better.
Which Approach Is Better?
The choice between SupraBTM and Monad depends on the specific requirements of a blockchain application:
SupraBTM (iBTM) is better suited for higher throughput environments with dynamic conflict ratios, such as DeFi and on-chain order books, where predictable execution and minimal rollbacks improve efficiency. This is made possible by the use of conflict analysis in conjunction with our optimized transactional algorithms that exploit parallelism to the fullest.
Monad’s 2PE is better suited to workloads with small transaction blocks, that is, where throughput requirements are modest; it is less suited to large-scale workloads.
Ultimately, as blockchain systems continue to evolve, hybrid approaches such as adaptive execution or those that combine optimistic execution with conflict specification-aware scheduling may emerge as the gold standard for parallel execution. The advantage of the conflict specification model proposed in SupraBTM (iBTM) is that it allows us to determine how many pairwise conflicts exist in a block of transactions. We are able to demonstrate how we can leverage this for an adaptive implementation that can also handle high-conflict workloads. Specifically, by computing a conflict threshold for each block of transactions with minimal overhead, we deterministically fall back to sequential execution in case of high conflicts. We believe this will allow us at Supra to leverage statistical AI techniques in the future for adaptive parallel execution algorithms.
Caveats
For Monad’s Two-Phase Execution (2PE), time was initially measured only between the clock start at L234 and the clock end at L282, using the timer instrumentation inserted by Monad’s team. We later moved the clock end to L293 to mark the true end of 2PE execution.