
AlDBaran: Chasing 50M State Updates a Second for Eclipse’s GigaCompute, Part 1

Written by
Eclipse Labs
Published on
June 30, 2025

Abstract

Efficient state management is a significant bottleneck for high-throughput blockchains. AlDBaran removes that long-standing bottleneck by decoupling the in-memory (Pleiades) and historical (Hyades) state commitment systems. A new append-only, twig-based sparse Merkle structure turns each read into a single SSD access, giving updates in O(1) I/O. Because every entry retains linked versions, Hyades can serve inclusion and exclusion proofs for any past block without full-state snapshots. This approach is fully in line with the principles laid out in Eclipse's Performance Thesis: careful hardware-software co-design in Eclipse's L2 context, in a manner that is typically not possible for an L1.

The goal we set for ourselves with GigaCompute and the GSVM Eclipse client was 1M TPS. AlDBaran can sustain a throughput in excess of 8M TPS; this corresponds to over 24M state updates per second with historization enabled. This is 10-20x faster than QMDB, the first authenticated database that we are aware of to surpass 1M updates per second. With historization disabled, Pleiades sustains over 48M updates per second in steady state, and even spikes to a peak of 60M updates per second. By decoupling hot-path state commitment computation from archival storage, AlDBaran represents a fundamental leap in authenticated database performance, immensely useful for Eclipse. It is also well-suited to deployment in a datacenter context, where Pleiades, Hyades, and snapshot storage can live on different machines connected with high-speed, low-latency links. As a result, we no longer see state commitment computation as a bottleneck for Eclipse.

Overview

We drew inspiration from both Eclipse's “bull” ethos, charging forward with unstoppable momentum, and the Taurus constellation, which represents the celestial bull. Thus our system is named after Aldebaran, the brightest star in the Taurus constellation. Pleiades and Hyades, the largest star clusters in that constellation, lend their names to the in-memory and historical authenticated database components.

After benchmarking the leading authenticated database engines, such as Firewood, MerkleDB, NOMT, LVMT, and QMDB, we discovered that unfortunately none could clear our target of 3M account state updates per second, so we designed Pleiades and Hyades to bridge that gap. Our early results for Pleiades on a 96-core AWS machine show over 24M state updates per second (with historization enabled, and 48M without), an order-of-magnitude improvement over QMDB.

Motivation: GigaCompute

Solana does not provide state commitments in the same way as Ethereum does. One of the reasons for that is performance: most authenticated databases are too slow to keep up with the SVM. AlDBaran sets out to change all that: we want to demonstrate that it’s possible to have a high-performance authenticated database engine that does not make state commitment computation a bottleneck.

On Eclipse, each transaction touches an average of three accounts, so the state engine must handle at least 3M state updates per second to reach our goal of 1M TPS. No off-the-shelf authenticated database we investigated gets remotely close, because each suffers from at least one of three performance shortcomings:

  • Disk I/O on the critical path: a single fsync reduces throughput from millions to thousands of operations per second.
  • Global mutexes: thread contention at the root reduces throughput and significantly increases tail latency.
  • Coarse-grained snapshotting: supporting exact block snapshots and verifiable proofs requires tagging every leaf with a per-update version at microsecond precision, a capability most Merkle tree libraries do not natively provide.
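To make that last requirement concrete, here is a minimal, hypothetical sketch of what per-update leaf versioning could look like; the struct and field names are illustrative only and are not AlDBaran's actual in-memory or on-disk format.

```rust
/// Hypothetical illustration only: a leaf tagged with the block height and a
/// per-update sequence number, plus a link to its previous version, so that
/// exact-block snapshots and verifiable proofs can be reconstructed later.
#[derive(Clone, Debug)]
struct VersionedLeaf {
    key: [u8; 32],             // 256-bit account key
    value_hash: [u8; 32],      // hash of the account payload
    block_height: u64,         // block in which this version was written
    seq: u64,                  // update sequence number within that block
    prev_version: Option<u64>, // index of the previous version of this key, if any
}

impl VersionedLeaf {
    /// A proof for block `h` must use the newest version with block_height <= h.
    fn visible_at(&self, h: u64) -> bool {
        self.block_height <= h
    }
}

fn main() {
    let leaf = VersionedLeaf {
        key: [0u8; 32],
        value_hash: [0u8; 32],
        block_height: 1_000,
        seq: 7,
        prev_version: Some(42),
    };
    assert!(leaf.visible_at(1_000) && !leaf.visible_at(999));
    println!("{leaf:?}");
}
```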

AlDBaran Architecture: Pleiades and Hyades

AlDBaran Systems Architecture - a typical blockchain pipeline

At its core, AlDBaran sits between the GSVM execution layer and the network-facing RPC/light-client tier, disaggregating state management into two purpose-built subsystems: Pleiades for sub-microsecond, in-RAM updates, and Hyades for inclusion and exclusion proof generation. These two components are bridged by snapshots.

When the execution engine closes a block, four streams proceed in parallel:

  1. [State updates] Execution engine → Pleiades
    As each (key, new-value) pair arrives, Pleiades immediately applies it to its sparse, twig-sharded Merkle tree entirely in DRAM. Updates and root-hash computation run asynchronously to execution; once all updates are applied, Pleiades emits a single 32-byte root hash.
  2. [State commitment] Pleiades → Snapshots
    The state updates are then processed at the snapshot storage, where we compact the data for later use in Hyades. This process is asynchronous, meaning it can be computationally expensive, with some flexibility to schedule it as eager or lazy computation.
  3. [Reconstructed historical state] Snapshots → Hyades
    Hyades ingests the snapshot stream into an append-only proof log. Indexing by block height prepares the historical state for consumption by RPC nodes.
  4. [Inclusion proofs] Hyades → RPC nodes
    The RPC nodes request inclusion and exclusion proofs that are prepared and served by Hyades. These may later be forwarded to light clients. We can also use multicast for this purpose. 

This split architecture keeps every hot-path operation in Pleiades (hashing, in-RAM updates) completely free of disk I/O and global locks, while archival and proof-generation work runs asynchronously in Hyades, sustaining tens of millions of state updates per second.
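As a rough, illustrative model of this decoupling (not the actual AlDBaran code), the pipeline can be pictured as a chain of channels: execution streams updates into a Pleiades-like stage that works purely in memory, while a Hyades-like stage drains the snapshot stream off the critical path. All names and message types below are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical message types; the real AlDBaran interfaces are not public.
struct StateUpdate { block: u64, key: [u8; 32], value: Vec<u8> }
struct SnapshotChunk { block: u64, bytes: Vec<u8> }

fn main() {
    let (update_tx, update_rx) = mpsc::channel::<StateUpdate>();
    let (snapshot_tx, snapshot_rx) = mpsc::channel::<SnapshotChunk>();

    // "Pleiades" stage: applies updates to an in-DRAM tree, forwards snapshot data.
    let pleiades = thread::spawn(move || {
        for upd in update_rx {
            let _ = upd.key; // 1. apply (key, value) to the in-memory tree (omitted)
            // 2. serialize the touched nodes for the asynchronous snapshot stream
            snapshot_tx
                .send(SnapshotChunk { block: upd.block, bytes: upd.value })
                .ok();
        }
        // when a block closes, a single 32-byte root hash would be emitted here
    });

    // "Hyades" stage: consumes snapshots at its own pace, off the critical path.
    let hyades = thread::spawn(move || {
        for chunk in snapshot_rx {
            // append to the proof log and index by block height (omitted)
            let _ = (chunk.block, chunk.bytes.len());
        }
    });

    // Execution-engine side: stream updates without waiting on any disk I/O.
    update_tx
        .send(StateUpdate { block: 1, key: [0u8; 32], value: vec![0xAB; 128] })
        .ok();
    drop(update_tx); // closing the channel lets both stages drain and exit

    pleiades.join().unwrap();
    hyades.join().unwrap();
}
```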

Design Principles

To hit multi-million state updates per second on the hot path, Pleiades adheres to the following design rules (a toy sketch follows the list):

  1. Thread sharding — the key space is split across cores; no locks on the critical path.
  2. Twig buffering — per-update hashing stops at a per-thread twig root; we finish the top of the tree only once per block.
  3. SIMD batching — we hash sixteen branches in a single CPU instruction bundle.
  4. Deterministic breadth-first layout with prefetch hints — the next node is always in the L2 cache by the time we need it.
  5. No-disk promise — every byte lives in DRAM until the block is committed.
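To make rules 1-3 a little more tangible, here is a toy sketch of thread sharding with per-thread twig roots; the shard count, routing function, and hash are stand-ins, and the real Pleiades implementation (including its SIMD hashing) is certainly more involved.

```rust
use std::thread;

const SHARDS: usize = 16; // illustrative shard count

/// Rule 1: route a 256-bit key to a shard by its top byte, so each worker
/// owns a disjoint slice of the key space and no locks are needed.
fn shard_of(key: &[u8; 32]) -> usize {
    key[0] as usize % SHARDS
}

/// Stand-in for the real (SIMD-batched) hash of rule 3.
fn combine(a: u64, b: u64) -> u64 {
    a.rotate_left(17) ^ b.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

fn main() {
    // Rule 1 in action: an incoming update lands on exactly one shard.
    let some_key = [0x42u8; 32];
    println!("key routed to shard {}", shard_of(&some_key));

    // Pretend each shard accumulated a batch of leaf hashes for this block.
    let per_shard_leaves: Vec<Vec<u64>> = (0..SHARDS)
        .map(|s| (0..1024u64).map(|i| i + s as u64).collect())
        .collect();

    // Rule 2: each worker reduces only its own leaves to a single twig root,
    // in parallel, touching only memory it owns.
    let twig_roots: Vec<u64> = per_shard_leaves
        .into_iter()
        .map(|leaves| thread::spawn(move || leaves.into_iter().fold(0, combine)))
        .collect::<Vec<_>>()
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect();

    // The top of the tree is finished once per block, not once per update.
    let block_root = twig_roots.into_iter().fold(0, combine);
    println!("block root (toy value): {block_root:#x}");
}
```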

Hyades can consume the snapshots at its own pace, creating inclusion and exclusion proofs in response to external queries. The raw data for the state updates can similarly be stored separately, in a disaggregated fashion, from the snapshot data. This allows us to completely eliminate I/O operations from the critical path, so Hyades can serve inclusion and exclusion proofs within milliseconds without ever blocking it.

Because Hyades is not on the critical path, it can afford to be disk-bound. As with other components, it could run on completely separate machines in a data center context. We could also get fancy and do more computationally intensive work on Hyades: for example, it could go beyond inclusion and exclusion proofs and provide range proofs as well.

Proof system

The proof system of AlDBaran is based on Sparse Merkle Trees (SMTs) and provides standard inclusion and exclusion proofs for account data. Pleiades computes SMT roots with almost perfect parallelism, thanks to its in-memory representation and optimized hashing flow, while Hyades versions Merkle proofs for individual accounts on updates. We'll dive into the details of the proof system in a future article.
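To give a flavor of what inclusion and exclusion proofs over an SMT involve, here is a deliberately simplified sketch: verification recomputes the root from a leaf digest, the key's bit path, and prover-supplied siblings, and an exclusion proof is the same walk starting from the well-known empty-leaf digest. The depth, hash, and layout below are toy stand-ins, not AlDBaran's actual proof format.

```rust
// Toy sparse-Merkle-tree proof check; the hash, key width, and proof layout
// are simplified stand-ins for whatever AlDBaran actually uses.

const DEPTH: usize = 16; // a real 256-bit SMT would use 256 levels
const EMPTY: u64 = 0;    // digest of an empty subtree (by construction below)

/// Stand-in for a cryptographic hash of two child digests. Note that
/// hash_pair(EMPTY, EMPTY) == EMPTY, so empty subtrees stay "empty" upward.
fn hash_pair(left: u64, right: u64) -> u64 {
    left.rotate_left(13) ^ right.wrapping_mul(0xA076_1D64_78BD_642F)
}

/// Recompute the root from a starting digest, the key's bit path (leaf
/// upward), and the sibling digests supplied by the prover.
fn root_from_proof(mut node: u64, key: u16, siblings: &[u64; DEPTH]) -> u64 {
    for level in 0..DEPTH {
        let bit = (key >> level) & 1;
        node = if bit == 0 {
            hash_pair(node, siblings[level])
        } else {
            hash_pair(siblings[level], node)
        };
    }
    node
}

/// Inclusion: the recomputed root must match the committed root.
fn verify_inclusion(root: u64, leaf: u64, key: u16, siblings: &[u64; DEPTH]) -> bool {
    root_from_proof(leaf, key, siblings) == root
}

/// Exclusion: identical walk, but starting from the empty-leaf digest.
fn verify_exclusion(root: u64, key: u16, siblings: &[u64; DEPTH]) -> bool {
    root_from_proof(EMPTY, key, siblings) == root
}

fn main() {
    // A tree holding exactly one key: every sibling along its path is empty.
    let key: u16 = 0b1010_0011_0000_0101;
    let leaf: u64 = 42;
    let siblings = [EMPTY; DEPTH];
    let root = root_from_proof(leaf, key, &siblings);
    assert!(verify_inclusion(root, leaf, key, &siblings));

    // An absent key in an otherwise empty tree is proven by the same walk.
    let empty_root = root_from_proof(EMPTY, key, &siblings);
    assert!(verify_exclusion(empty_root, key, &siblings));
    println!("inclusion and exclusion checks passed (root = {root:#x})");
}
```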

Performance Benchmarks & Results

All results below were produced with the pleiades-bench harness compiled under Rust nightly-2025-06-20. The workload comprises 256-bit uniformly random keys (no cache-friendly prefixes) and a mix of updates (90%), inserts (5%), and deletes (5%). The number of leaves was at least 128M to keep the working set far outside the last-level cache.
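The pleiades-bench harness itself is not public, so the following is only a plausible reconstruction of the described workload parameters (256-bit uniform keys, a 90/5/5 update/insert/delete mix); every name in it is made up for illustration.

```rust
// A plausible reconstruction of the described benchmark workload.
// The real pleiades-bench harness is not public; names here are hypothetical.

/// Minimal xorshift64* PRNG so the example has no external dependencies.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// 256-bit uniformly random key (no cache-friendly prefix).
    fn next_key(&mut self) -> [u8; 32] {
        let mut key = [0u8; 32];
        for chunk in key.chunks_mut(8) {
            chunk.copy_from_slice(&self.next_u64().to_le_bytes());
        }
        key
    }
}

#[derive(Debug)]
enum Op {
    Update([u8; 32]),
    Insert([u8; 32]),
    Delete([u8; 32]),
}

/// 90% updates, 5% inserts, 5% deletes, as described above.
fn next_op(rng: &mut XorShift64) -> Op {
    let key = rng.next_key();
    match rng.next_u64() % 100 {
        0..=89 => Op::Update(key),
        90..=94 => Op::Insert(key),
        _ => Op::Delete(key),
    }
}

fn main() {
    let mut rng = XorShift64(0xDEAD_BEEF_CAFE_F00D);
    // In the real runs the tree is pre-populated with >= 128M leaves so the
    // working set stays far outside the last-level cache; 10 ops shown here.
    for _ in 0..10 {
        println!("{:?}", next_op(&mut rng));
    }
}
```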

For our measurements, we rented an AWS i7ie.metal-48xl server. It comes with two Xeon 8559C CPUs with 96 cores and 192 threads in total. It is equipped with 1,536 GB of RAM as 16× 96 GB DDR5-5600 and has 16× AWS Nitro v3 NVMe SSDs running in RAID 0, formatted with XFS.

Scaling Behavior: Sustaining Line-Speed Processing

Given the asynchronous nature of AlDBaran’s approach, we primarily concern ourselves with the throughput the system delivers, as opposed to latency.  

Solana has roughly one billion accounts, active and otherwise. Furthermore, many results reported by authenticated database benchmarks investigate the one-billion to eight-billion key range. We have therefore chosen to provide benchmark numbers at 2²⁷ (128 million), 2³⁰ (one billion), and 2³³ (eight billion) accounts.

Under the 1B-key setting, the 96-core AWS machine sustained over 48M state updates per second without Hyades, and even peaked at 60M updates per second, a near-perfect scaling result, with each CPU core contributing approximately 0.5M state updates per second. Single-core runs peak at 0.64M updates per second, so end-to-end efficiency sits at roughly 78 percent of the isolated core rate (see aside).

Aside: NUMA & Memory Bandwidth Constraints

On a dual-socket system with an eight-channel DDR5 bus, cross-socket latency and finite memory bandwidth prevent each core from reaching its solo peak. These NUMA effects and bus saturation cap per-core contributions at ~78 percent in full-system runs.

Enabling SMT (simultaneous multithreading) further confirms that memory latency is the bottleneck here: hyper-threads improve throughput by 40%, because while one thread waits on memory, the other can already do useful work. We validated this thesis using the platform-provided hardware monitoring tools and identifying the longest CPU pipeline stalls as caused by DRAM access.

Enabling historization via Hyades cuts performance in half, to 24M updates per second. This is the fair number to use when comparing with other projects, such as QMDB, as historization is an integral part of their implementation. We report both numbers because our architecture allows history to live on a separate machine.

Each update serializes 400 bytes of snapshot data (leaf segment plus inner-node hops). If we dedicate 64 cores and 128 threads of the host server's CPU to the snapshot stream, this corresponds to a total of 50 Gbps, roughly 50 MB/s per thread: well within the burst capacity of two PCIe 4.0 ×4 NVMe drives, sustainably handled by four commodity SSDs, and easily accommodated by a 50 GbE link. We remind the reader that our ed25519 signature verification software on a 64-core server can similarly saturate that link.
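As a quick sanity check of the per-thread figure, here is a back-of-the-envelope calculation using only the numbers already quoted in this paragraph.

```rust
fn main() {
    // Figures quoted above: an aggregate snapshot stream of ~50 Gbps handled
    // by 128 threads. Spelling out the per-thread share:
    let total_gbps = 50.0_f64;
    let total_bytes_per_sec = total_gbps * 1e9 / 8.0; // 6.25 GB/s
    let threads = 128.0_f64;
    let per_thread_mb_per_sec = total_bytes_per_sec / threads / 1e6;
    // Prints ~48.8 MB/s, i.e. the "roughly 50 MB/s per thread" quoted above.
    println!("per-thread snapshot bandwidth: {per_thread_mb_per_sec:.1} MB/s");
}
```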

Fig 3: Scaling with number of worker threads.

With a fixed 128M key space, we varied the number of worker threads handling the updates from 16 up to 512. Throughput rises sub-linearly until the dual-socket memory subsystem saturates. This confirms that Pleiades is memory-bound.

We observe that root hash generation, historically the slowest stage of block production, now occupies a single-digit share of CPU time. For Eclipse, this effectively eliminates state commitment computation as a throughput bottleneck, allowing the chain to operate at line speed on fast networks. Furthermore, our implementation of Pleiades does not require a special kernel or an accelerator card.

Throughput vs. Key‐Space Size

Fig 4: Scaling with Key Sizes

Our design aims to accommodate eventual growth of the user address space. We measured sustained state updates per second over a 10-minute run with three key-space sizes: 128M, 1B, and 8B accounts; the latter could be considered overkill. Throughput stabilizes as the store fills and remains stable thereafter. At 1 billion accounts, we get more than 45M state updates per second, which would correspond to over 15M TPS. Even at 2³³ accounts (over 8B), well beyond typical chain sizes, the engine sustains nearly 40M state updates per second.

Comparing with QMDB

We used the publicly available version of QMDB from GitHub for some of the measurements below, on the same AWS machine described above; this machine is more powerful than the machines used in the QMDB paper. The QMDB paper reports update numbers that reach 2.28M state updates per second. Despite our efforts, we were unable to replicate those numbers using the code in https://github.com/LayerZero-Labs/qmdb (which we call QMDB v0.2.0), although configuration differences may account for some of the discrepancy. For completeness, we also measured a more recent code drop of QMDB (https://github.com/LayerZero-Labs/fafo/tree/main/qmdb, from June 26, 2025), which we call FAFO/QMDB in the table below. In our measurements we get 1,280,000 updates per second for QMDB v0.2.0 and 803,000 for FAFO/QMDB. We used the provided speed and bench-sendeth tools to generate workloads with random keys. In any event, we see at least an order-of-magnitude improvement for AlDBaran over QMDB. We hypothesize that some of the difference is caused by QMDB's inability to scale beyond a fixed number of cores and by excessive locking.

Authenticated database            | AlDBaran throughput at 2²⁷ accounts | Speedup vs. QMDB v0.2.0 | Speedup vs. FAFO/QMDB
Pleiades (500 ms commit period)   | 24,100,000                          | 19×                     | 30×
Pleiades (100 ms commit period)   | 14,800,000                          | 11×                     | 18×

Related Work

Blockchain state management combines a proof layer (an authenticated data structure, ADS) with a storage layer (a key-value store). Traditional implementations, for example, Ethereum’s Merkle Patricia Trie (MPT) on LevelDB or RocksDB, suffer from high write amplification and O((log N)²) SSD I/Os per state update, leading to I/O-bound execution layers. To overcome these bottlenecks, a new generation of integrated ADS engines has emerged. Below, we summarize five leading designs.

  • Firewood (Avalanche)
    Firewood is a compaction-less, integrated Merkle Trie engine where the on-disk index is the trie itself, stored in a B⁺-tree-like layout. By eliminating the layered KV abstraction and pruning outdated revisions in-place via a Future-Delete Log, it minimizes write amplification and I/O overhead. Designed to hit >10,000 TPS for Avalanche and its Subnets, Firewood is currently in alpha preview but promises significant single-node performance improvements over MerkleDB+RocksDB and a proof system in future versions.
  • Avalanche MerkleDB
    MerkleDB implements a Merkle Radix Trie layered on RocksDB, leveraging copy-on-write “Views” and batched commits to optimize block execution. It supports Avalanche’s high-throughput workloads (several thousand TPS) and sub-second finality while providing inclusion, range, and change proofs. MerkleDB benefits from RocksDB’s compression (Snappy, ZSTD) but inherits its write amplification and compaction overhead. It seems to be production-ready, audited by OpenZeppelin in March 2023.
  • Nearly-Optimal Merklization (NOMT)
    NOMT couples a binary Merkle Trie with a flat key-value store, using an SSD-friendly, page-aligned layout to reduce random I/O. Written in Rust, it achieves roughly 43,000 state updates/sec per thread and saturates modern SSDs at multi-GB/s throughput, representing an order-of-magnitude speedup over naive MPT implementations. NOMT offers constant-factor I/O reductions but is still subject to inherent trie asymptotics.
  • Layered Versioned Multipoint Trie (LVMT)
    LVMT layers an append-only Merkle tree atop an Authenticated Multipoint Trie (using a vector-commitment scheme) rather than a standard Merkle Patricia Trie. By storing only compact commitment data in the trie and deferring most hashing to the vector-commitment, LVMT achieves amortized O(1) root commitments (nearly constant time regardless of state size). This lets it sidestep the O(log N) hashing asymptotics of traditional ADS. In experiments on Ethereum-like workloads, LVMT delivered up to 6× faster read/write operations and 2.7× higher transaction throughput compared to a conventional MPT on LSM storage.
  • Quick Merkle Database (QMDB)
    QMDB unifies KV storage and Merkleization into fixed-size, append-only “twigs” that batch 2,048 entries, enabling O(1) SSD I/Os per update, a single SSD read per lookup, and fully in-memory Merkle hashing (2.3 bytes of DRAM per entry). It delivers up to 2.28M updates per second, supporting 1M TPS, scales to 15 billion entries in practice, and adds historical proofs for past-state queries.

Together, these ADS engines illustrate the evolving landscape of blockchain-optimized storage: from layered trie approaches (MerkleDB) through integrated on-disk tries (Firewood), flash-native layouts (NOMT), algebraic commitments (LVMT), to fully unified, high-throughput twig architectures (QMDB). Each design makes unique trade-offs between I/O complexity, memory footprint, proof capabilities, and production maturity.

Conclusions 

Pleiades turns state-root generation from a bottleneck into spare capacity: over 24M state updates per second with history, 48M without, with spikes to a 60M-updates-per-second peak, on off-the-shelf hardware, with near-linear core scaling and snapshot bandwidth that barely dents an NVMe lane. These throughput numbers can easily sustain network speeds of around 50 Gbps at line-speed capacity. The split design, Pleiades for hot state and Hyades for history, means Eclipse's million-TPS target now runs with double-digit headroom.

Part 2 will dive into the internals and publish more detailed benchmarks.
