
Leveraging Celestia Blobstream for Data Availability Proofs

Written by Eclipse Labs
Published on April 24, 2025

Layer 2 (L2) networks must post data to Layer 1 (L1) or to an alternative data availability (DA) layer so others can reconstruct the L2's state. While the validity of this data is verified through either validity proofs or fraud proofs, a critical question emerges: what happens if an L2 simply refuses to publish its data? This scenario, known as a data withholding attack, requires specific protections in the smart contracts that connect the L2 to the underlying L1. While Ethereum rollups can easily prove the existence of EIP-4844 blobs, the process is more complex for L2s that use alternative DA layers, as Eclipse does with Celestia. Let's explore DA proofs and DA bridging.

What is DA bridging?

The DA layer’s value proposition is that it guarantees the availability of data for a given amount of time, through algorithmic and/or economic guarantees. The question here is: “how can I prove that this data truly exists on the DA layer, assuming that the DA layer is providing data as expected?” There exist mechanisms to challenge faulty behavior from the DA layer, but that is a topic for another time. Here we assume that the DA layer is behaving honestly, storing and providing the data as requested.

L2s that use alternative DA layers do not inherit Ethereum's economic security, but rather that of their chosen DA layer. Therefore, it is crucial for DA layers to have robust mechanisms protecting against malicious actors. These typically include consensus protocols, data availability sampling, fraud proofs, and slashing of misbehaving nodes. Together, these mechanisms ensure the DA layer fulfills its commitment to store and provide data, protecting against attackers up to a specific funding threshold.

When using EIP-4844 on Ethereum, rollups can directly check the existence of blobs. While EIP-4844 blobs are not directly readable from smart contracts, the BLOBHASH opcode exposes the versioned hashes of the blobs carried by the current transaction, making it possible to prove that a specific blob was posted and paid for, and therefore exists on the chain. The rollup can simply submit a blob and call the rollup smart contract in a single transaction, enabling the smart contract to verify that the blob exists.

There is no corresponding EVM primitive to prove the existence of a blob on Celestia from a smart contract on Ethereum. This is the purpose of DA bridging: DA providers build oracles that publish the state root of the DA layer to Ethereum, and provide tooling to open these commitments and verify the existence of blobs on the DA layer.

DA Bridging on Celestia

Celestia provides a DA bridging specification called Blobstream. There exist two main implementations of it: Blobstream0, maintained by RISC Zero, and SP1 Blobstream, maintained by Succinct. The two work very similarly, so we will refer to either of them as Blobstream in the rest of the article.

Blobstream uses a relayer that listens to Celestia blocks, generates a ZK proof that each block was properly signed by a supermajority of the Celestia validator set (over two-thirds of its voting power), and builds a Merkle tree of (block height, block data root) tuples. The root of this Merkle tree, called the data root tuple root, is then sent to the Blobstream smart contract for storage.

The contract interface can then be used to verify that a specific Celestia block exists and is attested to by the Celestia validator set, using the verifyAttestation() method.
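For intuition, here is a minimal Rust sketch of what such an attestation check amounts to. The struct and field names loosely mirror the Blobstream interface but are ours; the leaf encoding and hashing are simplified (the real trees ABI-encode the tuple and apply RFC 6962-style domain separation), so treat this as illustrative rather than the contract's actual logic.

```rust
use sha2::{Digest, Sha256};

/// One leaf of the data root tuple tree (names illustrative).
struct DataRootTuple {
    height: u64,
    data_root: [u8; 32],
}

/// A standard binary Merkle proof: sibling hashes from leaf to root.
struct BinaryMerkleProof {
    side_nodes: Vec<[u8; 32]>,
    key: u64, // leaf index
}

/// Hash a tuple into a leaf. Simplified: the real tree ABI-encodes the
/// tuple and adds domain-separation prefixes.
fn leaf_hash(tuple: &DataRootTuple) -> [u8; 32] {
    let mut h = Sha256::new();
    h.update(tuple.height.to_be_bytes());
    h.update(tuple.data_root);
    h.finalize().into()
}

/// Roughly what verifyAttestation() does: walk the Merkle path from the
/// tuple leaf up to the stored data root tuple root. Assumes a balanced
/// tree for simplicity.
fn verify_attestation(
    stored_tuple_root: [u8; 32],
    tuple: &DataRootTuple,
    proof: &BinaryMerkleProof,
) -> bool {
    let mut node = leaf_hash(tuple);
    let mut index = proof.key;
    for sibling in &proof.side_nodes {
        let mut h = Sha256::new();
        if index % 2 == 0 {
            h.update(node);
            h.update(sibling);
        } else {
            h.update(sibling);
            h.update(node);
        }
        node = h.finalize().into();
        index /= 2;
    }
    node == stored_tuple_root
}
```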

Celestia Proofs: I Hope You Like Merkle Trees

Celestia’s proof system is admittedly a little complicated, so let’s take a few paragraphs to clarify how it works.

The Celestia Merkle tree structure. In total, three nested trees are used. Source: Celestia.

We already talked a bit about the data root tuple tree, whose root gets committed to Ethereum. Each leaf points to a single Celestia block. This block is itself made of multiple blobs, which are themselves made of one or more shares: 512-byte binary slices of data that make up the block. Shares are the smallest indivisible data unit on Celestia. Each share belongs to a single blob, which itself belongs to a user-defined namespace. Namespaces are used to sort shares by application, with some namespaces being reserved for system usage. For reference, Eclipse has its own namespace on Celestia.
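As a mental model, the containment hierarchy looks roughly like this in Rust; the types are illustrative, not Celestia's actual ones, with sizes taken from the Celestia specifications (29-byte namespaces, 512-byte shares).

```rust
/// Illustrative containment hierarchy; not Celestia's actual types.
struct Namespace([u8; 29]); // 1 version byte + 28-byte ID; some reserved
struct Share([u8; 512]);    // namespace prefix + payload, 512 bytes total

/// A blob is a namespaced run of one or more shares.
struct Blob {
    namespace: Namespace,
    shares: Vec<Share>,
}

/// A block carries many blobs, plus system shares such as transactions.
struct Block {
    height: u64,
    blobs: Vec<Blob>,
}
```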

Celestia uses erasure coding to ensure blobs can be retrieved from the network even if some shares become unavailable or corrupted. Erasure coding imposes a two-level commitment to shares. In short, shares are laid out in a k*k matrix. This matrix is then extended into a 2k*2k matrix named the Extended Data Square (EDS). For each column and each row, a Namespaced Merkle Tree (NMT) is built. This enables:

  1. Efficient identification of all shares in a namespace.
  2. Efficient exclusion proofs for namespaces.

The 4k NMT roots (2k row roots and 2k column roots) are then hashed into a binary Merkle tree to compute the data root of the block.
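A minimal sketch of that last step, assuming the 2k row roots and 2k column roots are already computed. Real Celestia trees use RFC 6962-style domain separation and namespace-carrying NMT roots; both are omitted here.

```rust
use sha2::{Digest, Sha256};

/// Simplified: the block data root is the binary Merkle root over the
/// concatenated row roots and column roots of the EDS. Assumes a
/// power-of-two leaf count, which holds since k is a power of two.
fn data_root(row_roots: &[[u8; 32]], col_roots: &[[u8; 32]]) -> [u8; 32] {
    let mut level: Vec<[u8; 32]> = row_roots
        .iter()
        .chain(col_roots.iter())
        .copied()
        .collect();
    // Reduce pairwise until a single root remains.
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                let mut h = Sha256::new();
                h.update(pair[0]);
                h.update(pair[1]);
                h.finalize().into()
            })
            .collect();
    }
    level[0]
}
```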

The Celestia Extended Data Square. Source: Celestia

We will not go into the details here, but the official documentation and L2Beat provide excellent primers on the Extended Data Square structure on which each block is built. An important point to note here is that Celestia transactions are also stored as shares inside the EDS.

Proving Data Availability

Now that we are up to date on Celestia’s commitment scheme, let’s take a look at how we can use it to prove that the rollup data is indeed available. Celestia provides two ways to identify a blob:

  1. As a sequence of spans, which identifies the blob by its position in the EDS, more specifically using a (block_height, start, n_shares) tuple.
  2. With a blob share commitment, i.e. a hash committing to the data itself. With this method, a blob is identified by a (block_height, blob_commitment) tuple.

The first method only describes where to find the data, while the second commits to the content of the blob. Both methods can be used to prove that the rollup did not post a blob, but the sequence of spans method wins in terms of simplicity.
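Side by side, the two identifiers look roughly like this in Rust (type and field names are ours, for illustration):

```rust
/// The two ways to point at a Celestia blob.
enum BlobId {
    /// Positional: where the blob's shares sit in the block's square.
    SequenceOfSpans {
        block_height: u64,
        start: u64,    // index of the blob's first share
        n_shares: u64, // number of shares the blob occupies
    },
    /// Content-addressed: a commitment to the blob's bytes.
    BlobShareCommitment {
        block_height: u64,
        commitment: [u8; 32],
    },
}
```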

Exclusion proofs for sequences of spans

Using the positional method, you only need to prove that the blob exists within the square for this block. Remember, we do not care about the content of the blob for DA proofs, we only care that the blob exists! This is a simple bounds check that is trivial to implement: to prove exclusion, you only need to verify that the span overruns the square, i.e. that start + n_shares > total_shares, where total_shares is the number of shares in the block’s data square. An important property of this method is that proving exclusion is symmetric with proving inclusion (start + n_shares <= total_shares).
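A minimal sketch of that bounds check; using the total share count of the block's square as the boundary is our assumption, as the exact boundary depends on how the square size is committed.

```rust
/// Exclusion check for the sequence-of-spans method: a claimed span
/// cannot exist if it overruns the block's shares. `total_shares` is
/// the number of shares in the block's data square.
fn span_is_excluded(start: u64, n_shares: u64, total_shares: u64) -> bool {
    // checked_add guards against overflow on adversarial inputs.
    start
        .checked_add(n_shares)
        .map_or(true, |end| end > total_shares)
}

/// The symmetric inclusion check: the span fits within the square.
fn span_is_included(start: u64, n_shares: u64, total_shares: u64) -> bool {
    !span_is_excluded(start, n_shares, total_shares)
}
```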

Exclusion proofs for blob share commitments

Before we address exclusion proofs, we must first explain how inclusion proofs work for the blob share commitment method. The Celestia RPC API provides the blob.GetProof method, which fetches NMT proofs for the blob in question. These proofs can then be used to validate that a) the shares exist and b) they correspond to the share commitment, proving the existence of the blob. Note that the cost of verifying this proof grows with the size of the blob. A simple optimization here is to find the “Pay For Blob” (PFB) transaction/message that pays for the inclusion of the blob and then prove the inclusion of this message instead. PFB messages are guaranteed to span a maximum of two rows, meaning a bounded verification cost regardless of the size of the blob.
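In pseudocode-like Rust, the verifier's job looks roughly as follows; NmtProof, verify_nmt_proof, and commitment_from_shares are hypothetical stand-ins for the real NMT and commitment machinery, which we do not reproduce here.

```rust
/// Hypothetical types standing in for Celestia's real proof objects.
struct NmtProof { /* sibling hashes, namespace bounds, ... */ }
struct Share([u8; 512]); // a Celestia share is 512 bytes

/// Sketch of checking a blob inclusion proof fetched via blob.GetProof:
/// (a) every share verifies against the block's data root, and
/// (b) the shares re-derive the blob share commitment we were given.
fn verify_blob_inclusion(
    data_root: [u8; 32],
    expected_commitment: [u8; 32],
    shares: &[Share],
    proofs: &[NmtProof],
) -> bool {
    // (a) Each proof ties a run of shares to the data root. Cost grows
    // with the number of shares, i.e. with blob size.
    for proof in proofs {
        if !verify_nmt_proof(data_root, shares, proof) {
            return false;
        }
    }
    // (b) Recompute the commitment from the shares themselves.
    commitment_from_shares(shares) == expected_commitment
}

// Stubs for the real NMT / commitment routines.
fn verify_nmt_proof(_root: [u8; 32], _shares: &[Share], _p: &NmtProof) -> bool { true }
fn commitment_from_shares(_shares: &[Share]) -> [u8; 32] { [0u8; 32] }
```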

Exclusion proofs are a different story: the Merkle trees used in Celestia’s proof system do not support exclusion proofs. Instead, we need an exhaustive approach: we need a method to list all the blobs in the block and prove that our blob is not in the list.

We could implement this naïvely by downloading the entire block, but this is a bit wasteful. Rather, we know that all the blobs are listed in the PFB messages we mentioned earlier. The process looks like this:

  1. Download all the rows containing PFB messages (namespace 0x4) using the share.GetRow endpoint.
  2. Recompute the NMT roots for each row and prove inclusion of each root in the data root of the block.
  3. Deserialize the payload of each PFB message and check if the blob share commitment appears.

If your blob share commitment does not appear in any PFB message, congratulations: you just proved that the blob does not exist!
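A hedged sketch of that scan; Row, nmt_root, root_included_in_data_root, and decode_pfb_messages are hypothetical stand-ins for the RPC response, NMT, and PFB deserialization logic, and step 1 (fetching the rows via share.GetRow) is assumed to have happened already.

```rust
/// Hypothetical stand-ins for the RPC and proof plumbing.
struct Row { shares: Vec<[u8; 512]> }
struct PfbMessage { share_commitment: [u8; 32] }

/// Sketch of the exhaustive exclusion proof: scan every PFB row
/// (namespace 0x4) in the block and show that no PFB message pays for
/// `target_commitment`. A real verifier must also check that the rows
/// cover the entire PFB namespace, which the NMT's namespace ordering
/// makes provable.
fn prove_commitment_excluded(
    data_root: [u8; 32],
    pfb_rows: &[Row],
    target_commitment: [u8; 32],
) -> Result<(), String> {
    for row in pfb_rows {
        // Step 2: recompute the row's NMT root and tie it to the data
        // root, so no PFB row can be forged or hidden.
        let row_root = nmt_root(&row.shares);
        if !root_included_in_data_root(data_root, row_root) {
            return Err("row does not belong to this block".into());
        }
        // Step 3: deserialize the PFB messages and look for a match.
        for pfb in decode_pfb_messages(&row.shares) {
            if pfb.share_commitment == target_commitment {
                return Err("a PFB pays for this commitment: blob exists".into());
            }
        }
    }
    // No PFB references the commitment: the blob is not in this block.
    Ok(())
}

// Stubs for the real routines.
fn nmt_root(_shares: &[[u8; 512]]) -> [u8; 32] { [0u8; 32] }
fn root_included_in_data_root(_root: [u8; 32], _r: [u8; 32]) -> bool { true }
fn decode_pfb_messages(_shares: &[[u8; 512]]) -> Vec<PfbMessage> { Vec::new() }
```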

Comparison

| | Pros | Cons |
| --- | --- | --- |
| Sequence of spans | Inclusion and exclusion proofs are symmetric. Proofs do not require access to the blob data. | None |
| Blob share commitment | Commits to the data directly. | Proving exclusion requires an exhaustive proof. Requires a separate fraud proof game for exclusion proofs. |

Beyond the difference in complexity, we can add two points in favor of the sequence of spans method:

  1. During a fraud proof submission, the challenger must prove inclusion of the blobs they are using, to show that they are using the data committed by the L2. As the sequence of spans method does not require downloading the data, the availability/unavailability of the data can be proven at that stage.
  2. As the blob share commitment proofs are not symmetric, the L2 would need to support a separate fraud proof system to prove data availability frauds. This adds complexity.

Overall, the sequence of spans method should be preferred at this stage for optimistic rollups.

Optimizations for high throughput

The method we described above works well for single blobs, but in reality Eclipse posts many blobs to store a single batch. Eclipse posts up to 500 MB of data per hour to Celestia, in other words 500 blobs of 1 MB each. Storing 500 blob commitments per hour as L1 calldata is an expensive proposition! Furthermore, these costs would grow linearly with the throughput of the chain, meaning that more activity begets more L1 storage and more L1 costs.

We can easily avoid this problem by using an index blob that contains the sequences of spans for all the blobs of a given L2 block batch. This makes it possible to post a single Celestia blob commitment to L1, independently of the size of the L2 block batch. This is as efficient as it gets, at the cost of just a bit more implementation logic to download the index blob, deserialize it, and prove its availability.
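A sketch of what such an index blob could look like; the 20-byte entry layout below (8-byte height, 8-byte start, 4-byte share count) is an illustrative choice matching the size estimate in the next paragraph, not Eclipse's actual wire format.

```rust
/// One span entry in the index blob: where to find one data blob.
/// 20 bytes per entry: 8 (height) + 8 (start) + 4 (share count).
struct SpanEntry {
    block_height: u64,
    start: u64,
    n_shares: u32,
}

/// Serialize the batch's spans into a single index blob. Posting only
/// this blob's commitment to L1 keeps L1 costs constant per batch.
fn build_index_blob(spans: &[SpanEntry]) -> Vec<u8> {
    let mut out = Vec::with_capacity(spans.len() * 20);
    for s in spans {
        out.extend_from_slice(&s.block_height.to_be_bytes());
        out.extend_from_slice(&s.start.to_be_bytes());
        out.extend_from_slice(&s.n_shares.to_be_bytes());
    }
    out
}

/// Parse an index blob back into span entries; None on malformed input.
fn parse_index_blob(bytes: &[u8]) -> Option<Vec<SpanEntry>> {
    if bytes.len() % 20 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(20)
            .map(|c| SpanEntry {
                block_height: u64::from_be_bytes(c[0..8].try_into().unwrap()),
                start: u64::from_be_bytes(c[8..16].try_into().unwrap()),
                n_shares: u32::from_be_bytes(c[16..20].try_into().unwrap()),
            })
            .collect(),
    )
}
```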

Considering 20 bytes per span entry and a safe estimate of 1 MB per blob, a single 1 MB index blob can reference over 50,000 blobs, giving an upper limit of more than 50 GB per L2 batch, over 100x what we need today. This easily enables 1M+ TPS, a clear target in our vision for the GSVM client. We can also add a second level of index blobs if we ever need to go beyond this.

The last problem to solve is how to authenticate the blocks using Blobstream. As we mentioned earlier, each block must be authenticated using the verifyAttestation() method. In the worst-case scenario, this would require one call per blob. At about 90k gas per call, this would require 45M gas for an hour-long batch, while the maximum today on Ethereum is 36M gas.

This puts an upper limit to our batch size, and we don’t like upper limits.

Feel the Steel

Thankfully, there exists a ZK solution to this from RISC Zero, called Steel. Steel is a tool that enables anyone to prove smart contract code execution off-chain, within the RISC Zero zkVM. We can use it to verify as many Blobstream attestations as we’d like off-chain at a fraction of the cost, and verify a single ZK proof on-chain. RISC Zero estimates proving to cost in the tens of dollars and verification to cost around 300k gas. This corresponds to a 99% cost saving compared to our estimated 45M gas in the worst case!

Steel enables us to replace a linear on-chain verification cost with a constant one, removing the upper limit on throughput that we would face by verifying Blobstream attestations on-chain. Even better, RISC Zero provides proving services in the form of Bonsai and the recently announced Boundless decentralized proving network, enabling the generation of these proofs without in-depth ZK knowledge.
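Conceptually, the zkVM guest reads all the attestations, verifies each one, and commits a single journal that the L1 contract checks against one ZK proof. The sketch below uses RISC Zero's generic guest API (env::read / env::commit) rather than Steel's actual interface, and repeats the simplified Merkle walk from our earlier sketch.

```rust
// Conceptual zkVM guest batching Blobstream attestation checks.
// NOT Steel's actual API; an illustration of the shape of the program.
use risc0_zkvm::guest::env;
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};

#[derive(Serialize, Deserialize)]
struct Claim {
    height: u64,
    data_root: [u8; 32],
    side_nodes: Vec<[u8; 32]>, // Merkle path to the tuple root
    key: u64,                  // leaf index
}

fn main() {
    // The host feeds in the tuple root and every attestation to check.
    let (tuple_root, claims): ([u8; 32], Vec<Claim>) = env::read();

    for c in &claims {
        // Same simplified Merkle walk as the earlier sketch.
        let mut node: [u8; 32] = {
            let mut h = Sha256::new();
            h.update(c.height.to_be_bytes());
            h.update(c.data_root);
            h.finalize().into()
        };
        let mut index = c.key;
        for sib in &c.side_nodes {
            let mut h = Sha256::new();
            if index % 2 == 0 {
                h.update(node);
                h.update(sib);
            } else {
                h.update(sib);
                h.update(node);
            }
            node = h.finalize().into();
            index /= 2;
        }
        assert_eq!(node, tuple_root, "attestation failed");
    }

    // One committed journal replaces N on-chain verifyAttestation() calls.
    env::commit(&(tuple_root, claims.len() as u64));
}
```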

Conclusion

In this post, we outlined the next steps for DA bridging on Eclipse, an important milestone on the path to operational fraud proofs.

While necessary for securing Eclipse, DA bridging has the potential to become a bottleneck that limits our maximum throughput because of gas limits and L1 costs, hindering our GigaCompute ambitions. By combining the right blob commitment scheme, index blobs, and off-chain proving of Blobstream attestations, we show that L1 costs can be reduced to a strict minimum while enabling 1M+ TPS.

With the planned increase in Celestia’s block sizes to 1 GB, we anticipate being able to safely DA bridge with a throughput of up to 30M TPS.

This paves the way for Eclipse to achieve an exceptionally high throughput, while maintaining robust security guarantees for users.
