The Retrieval Market Working Group (RMWG) has been hard at work in 2022! This report summarizes its progress in the first half of 2022, as well as some insights into what to expect in the second half of 2022.

Reading time: about 10 minutes, so grab a coffee and enjoy this roundup!

Filecoin Network

We'll start with a high-level diagram of the Filecoin network, which will help explain what follows.

picture


Storage process

Starting at the top left, content publishers (aka storage clients) talk to deal-making services like Estuary, NFT Storage, and Filmine to store data. These services make deals with Storage Providers (SPs), which in turn add CIDs to indexer nodes. This process is fully operational.

Retrieval process

For the retrieval process, the retrieval client (RC) contacts a retrieval provider (RP) to get some data. If the RP has the data in its cache, it returns it. Otherwise, the request cache-misses either to the SP, once retrieval from SPs is reliable, or, for now, to the IPFS gateway. This process is under development.
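The cache-hit and cache-miss paths described above can be sketched as follows. This is a minimal illustration, not how any RP is actually implemented: the `RetrievalProvider` class, the `origins` list, and the status strings are hypothetical names, and the SP and the gateway are stood in for by plain dictionaries.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A hypothetical "origin" is any callable that returns the bytes for a CID,
# or None if it does not have them (e.g. an SP or an IPFS gateway).
Origin = Callable[[str], Optional[bytes]]


class RetrievalProvider:
    """Toy RP: serve from the local cache, otherwise fall back to origins."""

    def __init__(self, origins: List[Origin]) -> None:
        self.cache: Dict[str, bytes] = {}
        self.origins = origins  # tried in order on a cache miss

    def get(self, cid: str) -> Tuple[Optional[bytes], str]:
        # Cache hit: answer immediately.
        if cid in self.cache:
            return self.cache[cid], "hit"
        # Cache miss: ask each origin in turn and remember the answer.
        for fetch in self.origins:
            data = fetch(cid)
            if data is not None:
                self.cache[cid] = data
                return data, "miss"
        return None, "not_found"


# Usage: the SP is tried first, the gateway second.
sp_store = {"cid-1": b"from the SP"}
gateway_store = {"cid-2": b"from the gateway"}
rp = RetrievalProvider([sp_store.get, gateway_store.get])
first = rp.get("cid-1")   # fetched from the SP and cached
second = rp.get("cid-1")  # served from the RP cache
```

The order of the `origins` list encodes the fallback policy, so switching from the gateway to SPs is a one-line change in this sketch.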

RMWG themes

With the network diagram in mind, we can look at each topic for the first half of 2022.

picture

Topic 1: Retrieval Provider Nodes

A retrieval network cannot get started without retrieval provider (RP) nodes. Several different teams have been building RPs in 2022.

First, Myel is building the Myel PoP (Point of Presence). Until 2022, the team built the Myel PoP in Golang. In H1 2022, they rewrote it in Rust for better safety during development and so that it can be compiled to WASM for browser compatibility. The Rust version of the Myel PoP is not open source yet, but it will be soon.

In the first half of 2022, Protocol Labs began work on the Saturn network. The network has two levels of RP: L1 caches and L2 caches. The L1 cache node is the entry point for retrieval clients to the Saturn retrieval network. It uses Nginx under the hood. L2 cache nodes form the next cache tier behind L1. L2 is designed to run on home computers in home networks, reducing the hardware requirements for joining the wider Filecoin network. L2 is built with libp2p and written in Golang.

Two other teams have also been working on RPs in 2022: the Titan RP from New Web Group, which is about to be open sourced, and the FCR node from WCGCYX, which was open sourced in the last week of the first half of 2022.

More information on these networks is below.

Topic 2: Cryptoeconomics

The cryptoeconomics of retrieval is a huge topic, and we have made incremental progress in the first half of 2022. Essentially, this workstream aims to answer the following question:

What incentivises a Retrieval Provider to join the Filecoin Network?

Clients pay for retrieval directly

The easiest way to answer this question is to have retrieval clients pay the RP directly for each retrieval. At the right price, this will bring RPs to the market.

Myel envisions a retrieval network based on the assumption that retrieval clients will always pay directly for retrieval.

picture

This approach makes sense in many situations, especially server-to-server retrieval. Some examples are:

  • paying index providers,

  • paying reputation providers,

  • an L1 cache paying an L2 cache,

  • potential web3 browser retrieval (but this requires a paradigm shift in how we use browsers today).

This direct payment method also has some advantages over third-party subsidized retrievals. First, each retrieval is a local exchange between two entities: a retrieval client (RC) and an RP. This means that, at the end of the exchange, both parties have received what they wanted without further arbitration or bookkeeping. Second, the financial cost the RC incurs on each transaction deters it from launching a griefing, Sybil, or DDoS attack on the RP.

Clients do not pay for retrieval directly

When clients don't pay for retrieval directly, we have to zoom out to the entire network architecture and figure out where the payment might come from.

picture

In this diagram, the green line represents the payment flow, and the white line represents the data flow. If the RC does not directly pay per retrieval, the only other entity that might pay for the data accelerated by the RP is the content publisher.

Therefore, any attempt to incentivize RPs to join the network must find a mechanism to allow payments to flow from content publishers to RPs.

Several teams are developing different solutions for this scenario, where the client does not pay directly for retrieval:

Saturn

picture

In the Saturn network, each RP self-reports its retrievals to the Saturn orchestrator. The orchestrator then aggregates these logs and rewards each RP based on its contribution.
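As a rough sketch of the aggregate-then-reward idea, the snippet below sums self-reported logs per RP and splits a reward pool in proportion. It is illustrative only: the log format, the `split_rewards` function, and the choice of "bytes served" as the contribution metric are assumptions, not Saturn's actual payout rule.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def split_rewards(
    logs: Iterable[Tuple[str, int]], reward_pool: float
) -> Dict[str, float]:
    """Split reward_pool across RPs in proportion to self-reported bytes.

    logs is an iterable of (rp_id, bytes_served) entries; both the format
    and the metric are hypothetical.
    """
    totals: Dict[str, int] = defaultdict(int)
    for rp_id, bytes_served in logs:
        totals[rp_id] += bytes_served

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # nothing was served, so nothing is paid out
    return {
        rp_id: reward_pool * served / grand_total
        for rp_id, served in totals.items()
    }


# Usage: rp-a reports 200 bytes in total, rp-b reports 300.
payouts = split_rewards([("rp-a", 100), ("rp-b", 300), ("rp-a", 100)], 10.0)
```

Because the inputs are self-reported, any rule of this shape pays out more to whoever reports more, which is why the logs need to be checked before they are trusted.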

In the first half of 2022, Saturn launched a private mainnet and is collecting these retrieval logs. In the second half of 2022, Saturn will open up the mainnet and determine the payout associated with each RP's contribution.

The obvious problem with self-reporting is that it opens up some attack vectors.

picture

As you can see in the image above, the RP can collude with a retrieval client, which can fraudulently create thousands of "fake" retrievals to increase the log count. Likewise, the RP can simply send extra "fake" logs to the logging endpoint. A more elaborate attacker might spin up thousands of RCs and have them make requests to RPs that the attacker also controls.

These attack vectors are well known; the Saturn team will work with CryptoEconLab in the second half of 2022 to develop a fraud detection module to analyze Saturn retrieval logs.

Titan

The New Web Group is developing a retrieval network called Titan Ultra. The team chose a different approach than Saturn to demonstrate contributions in the network. In the Titan network, there are validator nodes that perform retrieval tests against RPs and report those tests back to the (initially) centralized orchestrator.

picture

This makes it harder for RPs to cheat in the network, as they must keep serving well in case they are being tested by a validator. This approach to network measurement is similar to that employed by the Storage Metrics DAO envisioned by Meson Network, Media Network, Theta, and CryptoNetLab.
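The validator idea can be sketched as a random spot check. This is a toy model under stated assumptions, not Titan's protocol: the `spot_check` function is hypothetical, RPs are modeled as callables, and a SHA-256 digest stands in for real CID verification.

```python
import hashlib
import random
from typing import Callable, Dict, Optional

# Hypothetical model: an RP is a callable returning bytes for a content ID.
Provider = Callable[[str], Optional[bytes]]


def spot_check(
    providers: Dict[str, Provider],
    expected_digests: Dict[str, str],
    sample_size: int,
    rng: random.Random,
) -> Dict[str, bool]:
    """Test a random sample of RPs on randomly chosen content.

    Returns {rp_id: passed}; a validator would report this to the
    orchestrator.
    """
    report: Dict[str, bool] = {}
    for rp_id in rng.sample(sorted(providers), sample_size):
        content_id = rng.choice(sorted(expected_digests))
        data = providers[rp_id](content_id)
        # The RP passes only if it returns bytes matching the known digest.
        report[rp_id] = (
            data is not None
            and hashlib.sha256(data).hexdigest() == expected_digests[content_id]
        )
    return report


# Usage: one RP serves the content, the other returns nothing.
content = {"file-1": b"hello retrieval market"}
digests = {k: hashlib.sha256(v).hexdigest() for k, v in content.items()}
rps = {"honest": content.get, "lazy": lambda content_id: None}
report = spot_check(rps, digests, sample_size=2, rng=random.Random(0))
```

Since an RP cannot tell a validator's request from an ordinary one in this model, it has to serve every request well to be sure of passing.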

The Titan team worked on an initial research grant between January 2022 and April 2022, and a follow-up grant is now underway to deploy a PoC version of the Titan Ultra network, which will land in Q3 2022.

From proof to payment

Regardless of how RPs prove their contributions to the network, all networks face the challenge of how to pay RPs based on those proofs. We expect to see progress on this step in the second half of 2022.

Watch this video and this video for a discussion of the cryptoeconomics of retrieval.

Retrieval Pinning

In addition to payment systems and market creation, a penalty system for missed retrievals is another way to incentivize RPs to provide reliable service. CryptoNet's Retrieval Pinning project implements such a system. Its two key elements are smart contracts and a referee network. Smart contracts allow clients and providers to agree on a "retrievability deal" for a given CID. After the deal is signed, collateral from the provider is locked in the contract. Referees can then retrieve files from providers and, in the case of bad service, trigger the smart contract to "slash" the provider (i.e. the provider loses its collateral).
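The lock-then-slash flow can be modeled in a few lines. This is a plain-Python sketch of the idea only: `RetrievabilityDeal` and its methods are hypothetical names, there is no real smart contract here, and slashing the full collateral on the first failed report is an assumption.

```python
class RetrievabilityDeal:
    """Toy model of a deal that locks provider collateral for one CID."""

    def __init__(self, cid: str, provider: str, collateral: int) -> None:
        self.cid = cid
        self.provider = provider
        self.locked = collateral  # collateral held while the deal is live
        self.slashed = False

    def referee_report(self, retrieved_ok: bool) -> int:
        """Record a referee's retrieval test; return the amount slashed."""
        if retrieved_ok or self.slashed:
            return 0
        # Assumption: a failed retrieval forfeits all locked collateral.
        amount, self.locked = self.locked, 0
        self.slashed = True
        return amount


# Usage: one good retrieval, then a failed one that triggers the slash.
deal = RetrievabilityDeal(cid="cid-1", provider="rp-a", collateral=100)
after_good = deal.referee_report(retrieved_ok=True)
after_bad = deal.referee_report(retrieved_ok=False)
```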

Topic 3: Payment Channels

The first half of 2022 saw great progress in the payment channel workstream. Magmo has been building go-nitro, a client for multi-hop payment channels in the Filecoin network. Magmo completed its initial grant at the end of June 2022. In July 2022, Magmo will begin a follow-up grant to productionize go-nitro and will join the FVM Foundry to begin work on the on-chain components of go-nitro.

picture

In short, go-nitro focuses on moving from the diagram on the left to the diagram on the right. On the left, when an RC wants to retrieve from an SP, the two must establish a pairwise payment channel, which requires an on-chain transaction. This means that we end up with a large number of payment channels, and each new SP that the RC wants to get data from needs an additional payment channel.

On the right, we envision a setup where RCs and SPs each open a single payment channel with their favorite "Hop Hub" (aka Payment Channel Provider). These hop hubs, in turn, all have payment channels between each other. With this setup, we can establish an off-chain virtual channel between an RC and an SP. After a payment is made over this virtual channel, it can be settled by shifting balances along the three collateralized payment channels underneath it. This greatly reduces the number of on-chain payment channels and means that no on-chain transaction is needed before paying a new SP.
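To make the reduction concrete, here is a small back-of-the-envelope count of on-chain channels in the two layouts. It is a sketch under simplifying assumptions that are not from go-nitro itself: in the pairwise layout every RC eventually retrieves from every SP, and in the hub layout each participant opens exactly one channel to a hub while the hubs are fully connected to each other.

```python
def pairwise_channels(num_rcs: int, num_sps: int) -> int:
    """Worst case for the left-hand layout: one channel per (RC, SP) pair."""
    return num_rcs * num_sps


def hub_channels(num_rcs: int, num_sps: int, num_hubs: int) -> int:
    """Right-hand layout: one channel per participant plus a hub mesh."""
    hub_mesh = num_hubs * (num_hubs - 1) // 2  # every pair of hubs connected
    return num_rcs + num_sps + hub_mesh


# Usage with made-up network sizes.
left = pairwise_channels(num_rcs=1000, num_sps=100)
right = hub_channels(num_rcs=1000, num_sps=100, num_hubs=3)
```

With these made-up sizes the pairwise layout needs 100,000 on-chain channels and the hub layout needs 1,103, and adding one more SP costs one new channel instead of up to 1,000.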

In addition, WCGCYX continues to work on FCR, a proxy payment and retrieval network. The idea here is that if an RP does not have a file, it can recursively ask its neighbors for it, and so on. Once the file is found, it can be returned to the client, and payment can then be brokered using payment channels between all intermediary providers, each collecting a small fee in the process. We're looking for a team to continue the great work that FCR has already started.
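The recursive lookup and per-hop fee can be sketched like this. It is a simplified model, not FCR's actual protocol: the `find_route` and `total_price` functions, the flat `hop_fee`, and the depth-first search order are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set


def find_route(
    neighbors: Dict[str, List[str]],
    holdings: Dict[str, Set[str]],
    start: str,
    cid: str,
    visited: Optional[Set[str]] = None,
) -> Optional[List[str]]:
    """Depth-first search from start for a provider that holds cid.

    Returns the chain of providers from start to the holder, or None.
    """
    visited = visited if visited is not None else set()
    visited.add(start)
    if cid in holdings.get(start, set()):
        return [start]
    for nxt in neighbors.get(start, []):
        if nxt not in visited:  # never ask the same provider twice
            route = find_route(neighbors, holdings, nxt, cid, visited)
            if route is not None:
                return [start] + route
    return None


def total_price(route: List[str], base_price: int, hop_fee: int) -> int:
    """Price paid by the client: the holder's price plus one fee per proxy."""
    return base_price + hop_fee * (len(route) - 1)


# Usage: A has no copy, asks B, which asks D, which holds the file.
neighbors = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
holdings = {"D": {"cid-1"}}
route = find_route(neighbors, holdings, "A", "cid-1")
price = total_price(route, base_price=10, hop_fee=1)
```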

Topic 4: Reputation System

In April 2022, Ken Labs completed a grant to build Pando, an off-chain verifiable data store for network data and metadata.

Ken Labs completed its first grant in March 2022, with an immediate follow-up grant to integrate Pando with services like Dealbot, Filecoin Green, Auto-retrieve, and more. This follow-up grant will also allow Ken Labs to build a monitoring system and web UI for Pando. The follow-up grant runs through September 2022.

Furthermore, reputation is closely related to the cryptoeconomic topic above, where we describe how RPs can demonstrate their network contributions to coordinators or validators. Data related to each of these retrieval tests can be used to build reputations around SPs and RPs. CryptoNetLab is working on this through their Retrievability Oracle program.

Topic 5: Indexing

Indexer

In March 2022, the Protocol Labs Data Systems team delivered the indexer. The indexer stores a mapping from CIDs to the SPs that store them. It has been able to scale to billions of records.
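Conceptually, the mapping the indexer maintains looks like the toy in-memory version below. This is only a sketch of the data structure: the `Indexer` class and its `advertise` and `find` methods are hypothetical names and do not reflect the real indexer's API or storage engine.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class Indexer:
    """Toy in-memory index from CID to the SPs that store it."""

    def __init__(self) -> None:
        self._providers: Dict[str, Set[str]] = defaultdict(set)

    def advertise(self, sp_id: str, cids: Iterable[str]) -> None:
        """Record that sp_id stores each of the given CIDs."""
        for cid in cids:
            self._providers[cid].add(sp_id)

    def find(self, cid: str) -> List[str]:
        """Return the SPs known to store cid (empty list if none)."""
        return sorted(self._providers.get(cid, ()))


# Usage: two SPs advertise overlapping content.
index = Indexer()
index.advertise("sp-1", ["cid-a", "cid-b"])
index.advertise("sp-2", ["cid-b"])
providers_of_b = index.find("cid-b")
```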

In the RMWG, both Leeway Hertz and Ken Labs are running the Indexer node and are exploring other things they can build around the indexer related to tooling and testing.

Content indexing

Between January 2022 and March 2022, ChainSafe worked on a research grant to study content indexing for the Filecoin network. Although progress was made and documented, the conclusion was that content indexing is premature and that we should wait until retrieval from RPs or SPs is more performant and reliable.

Topic 6: Data transfer and transfer protocols

Between January 2022 and March 2022, the Myel team worked on a grant to build JS-graphsync. Between April 2022 and June 2022, the Myel team then worked on a grant to build rust-graphsync. The latter is still closed source, but will be open source soon. Implementing Graphsync in these two languages provides a key building block for the JS and Rust IPFS and Filecoin stacks.

Between April 2022 and June 2022, ChainSafe has been working on a WebRTC research grant to determine how the WebRTC protocol suite works across different browsers. They are writing up their findings to share by the end of June 2022.

In addition, both Titan and Myel benchmarked retrieval from providers behind NATs in home networks. In both cases, the teams found that performance was suboptimal and that multithreaded retrieval was probably the best way forward.

Topic 7: Browser Retrieval

The Saturn and Myel teams are the two teams that spent the most time thinking about browser retrieval in the first half of 2022.

Between March 2022 and June 2022, the Saturn team built a service worker that provides incremental verification of CAR files. This matters because the browser must be able to verify the files it retrieves from the decentralized web, since it has no implicit trust in the server it initially connects to and gets the data from.
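The idea of incremental verification is to check each block as it arrives, rather than downloading the whole file first. The sketch below shows the principle only and makes several assumptions: it is written in Python rather than as a browser service worker, `verified_blocks` is a hypothetical name, and a plain SHA-256 hex digest stands in for real CIDs and CAR parsing.

```python
import hashlib
from typing import Iterable, Iterator, Tuple


def verified_blocks(blocks: Iterable[Tuple[str, bytes]]) -> Iterator[bytes]:
    """Yield block payloads one at a time, verifying each before release.

    blocks is a stream of (expected_sha256_hex, payload) pairs. Raises
    ValueError on the first payload that does not match its digest.
    """
    for position, (expected, payload) in enumerate(blocks):
        if hashlib.sha256(payload).hexdigest() != expected:
            raise ValueError(f"block {position} failed verification")
        yield payload


# Usage: a clean stream, and one where the second block was tampered with.
good = [(hashlib.sha256(b).hexdigest(), b) for b in (b"one", b"two")]
tampered = [good[0], (good[1][0], b"evil")]

received = []
try:
    for payload in verified_blocks(tampered):
        received.append(payload)
    rejected = False
except ValueError:
    rejected = True
```

In this sketch the first block is delivered and the stream is cut off at the tampered one, so bad data is rejected as soon as it shows up.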

During the first half of 2022, the Myel team has been working on running Myel PoP nodes in service workers as well as in browser extensions. They rewrote the Myel PoP node in Rust so that it is browser compatible when compiled to WASM.

Topic 8: Network Monitoring

From February 2022 to June 2022, Leeway Hertz has been working on the Web3 CDN Comparison Dashboard. The team also wrote an explainer for the dashboard. Work on this dashboard continues in two directions:

  1. Bring more Web3 CDNs to the dashboard.

  2. Deploy more retrieval bots to retrieve from around the world.

Leeway Hertz also built a dashboard for Saturn to help the team monitor the performance of the network. The team is also now considering building a dashboard to display retrieval performance from the SP.

Learn more about RMWG

You can find more information about the work of the RMWG in several places.

Between February 2022 and April 2022, Onda Studio developed a new website for the RMWG: https://retrieval.market. On this site you can find links to all teams, projects, and available opportunities.

Another good place to find information is the Retrieval Markets Notion folder, which we've been updating weekly since early 2022.

You can also find recordings of RM Demo Days on our YouTube channel, with new content every few weeks in the first half of 2022.

We're always looking for more teams to get involved, so get in touch if you're interested.

Looking forward to the second half of 2022!