TL;DR: The Retrieval Market Working Group (RMWG) has been hard at work in 2022! This report summarizes progress in the first half of 2022 and provides some insight into what to expect in the second half of 2022.

Filecoin Network

We'll start with a high-level diagram of the Filecoin network, which helps frame the rest of this report.

picture


Storage flow

Starting at the top left, content publishers (aka storage clients) communicate with deal-making services like Estuary, NFT Storage, and Filmine to store data. These services make deals with Storage Providers (SPs), which in turn add the stored CIDs to indexer nodes. This flow is fully functional.

Retrieval flow

In the retrieval flow, a retrieval client contacts a retrieval provider (RP) to get some data. If the RP has the data in its cache, it returns that data. Otherwise, it cache-misses to the SP, or, for now, to the IPFS gateway while we work on reliable retrieval from SPs. This flow is under development.
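The cache-hit and cache-miss behaviour described above can be sketched in a few lines. This is a minimal illustration, not any team's implementation: the `RetrievalProvider` class and the `fetch_from_origin` callback (standing in for an SP or the IPFS gateway) are hypothetical names.

```python
from typing import Callable, Dict, Optional


class RetrievalProvider:
    """Toy RP: serves from a local cache and falls back to an origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], Optional[bytes]]):
        self.cache: Dict[str, bytes] = {}
        # Hypothetical callback standing in for an SP or the IPFS gateway.
        self.fetch_from_origin = fetch_from_origin

    def retrieve(self, cid: str) -> Optional[bytes]:
        if cid in self.cache:                # cache hit: answer immediately
            return self.cache[cid]
        data = self.fetch_from_origin(cid)   # cache miss: go to the origin
        if data is not None:
            self.cache[cid] = data           # keep it for the next request
        return data
```

After the first request for a CID, later requests for the same CID are served from the cache without contacting the origin again.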

RMWG topics

With the network diagram in mind, we can look at each topic the RMWG worked on in the first half of 2022.

picture

Topic 1: Retrieval Provider Nodes

A retrieval network cannot get started without retrieval provider (RP) nodes. In 2022, several different teams have been building RPs.

First, Myel builds the Myel PoP (Point of Presence). Before 2022, the team had built the Myel PoP in Golang. In the first half of 2022, they rewrote it in Rust to get stronger safety guarantees during development and to compile to WASM for browser compatibility. The Rust version of the Myel PoP is not open source yet, but it will be soon.

In the first half of 2022, Protocol Labs began work on the Saturn network. This network has two levels of RP: L1 caches and L2 caches. The L1 cache node is the entry point for retrieval clients into the Saturn retrieval network. It uses Nginx under the hood. L2 cache nodes sit at the next cache layer behind L1. They are designed to run on home computers in home networks, which lowers the hardware requirements for joining the wider Filecoin network. L2 is built with libp2p and written in Golang.

Two other teams also worked on RPs in 2022. The Titan RP from New Web Group will be open source soon, and the FCR node from WCGCYX was open sourced in the last week of the first half of 2022.

See below for more information on these networks.

Topic 2: Cryptoeconomics

The cryptoeconomics of retrieval is a huge topic, and we made incremental progress on it in the first half of 2022. Essentially, this workstream is designed to answer the following question:

What incentivises a Retrieval Provider to join the Filecoin Network?

Clients pay for retrieval directly

The easiest way to answer this question is to have retrieval clients pay the RP directly for each retrieval. At the right price, this will bring RPs to the market.

Myel's vision for a retrieval network assumes that retrieval clients will always pay directly for retrieval.

picture

There are many situations where this approach makes sense, especially server-to-server retrievals. Some examples are:

  • paying indexer providers,

  • paying reputation providers,

  • L1 caches paying L2 caches,

  • potential web3 browser retrievals (although this requires a paradigm shift from how we use browsers today).

This direct payment method also has some advantages over third-party subsidized retrievals. First, each retrieval is a local exchange between two entities: a retrieval client (RC) and an RP. This means that, at the end of the exchange, both parties have received what they wanted, with no further arbitration or bookkeeping. Second, the financial cost each retrieval imposes on the RC deters griefing, Sybil, and DDoS attacks by RCs against RPs.

Clients do not pay directly for retrieval

When clients don't pay for retrieval directly, we have to zoom out to the entire network architecture and figure out where the payment might come from.

picture

In this diagram, the green line represents the payment flow, and the white line represents the data flow. If the RC does not directly pay per retrieval, the only other entity that might pay for the data accelerated by the RP is the content publisher.

Therefore, any attempt to incentivize RPs to join the network must find a mechanism for payments to flow from content publishers to RPs.

Several teams are developing different solutions for the case where the client does not pay directly for retrieval:

Saturn

picture

In the Saturn network, each RP reports its retrievals to the Saturn Orchestrator. The Saturn Orchestrator then aggregates these logs and rewards each RP based on its contribution.
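To make "aggregates these logs and rewards each RP based on its contribution" concrete, here is a minimal sketch. It assumes contribution is measured in bytes served and that a fixed reward pool is split proportionally; neither assumption comes from the Saturn team, and `split_rewards` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def split_rewards(logs: Iterable[Tuple[str, int]], pool: float) -> Dict[str, float]:
    """Aggregate (rp_id, bytes_served) log entries and split `pool` proportionally."""
    served: Dict[str, int] = defaultdict(int)
    for rp_id, n_bytes in logs:
        served[rp_id] += n_bytes          # aggregation step
    total = sum(served.values())
    if total == 0:                        # nothing served, nothing to pay out
        return {rp_id: 0.0 for rp_id in served}
    return {rp_id: pool * n / total for rp_id, n in served.items()}
```

Because the inputs are self-reported, a scheme like this pays out on whatever the logs claim, which is why log integrity matters so much for this design.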

In the first half of 2022, Saturn launched a private mainnet and began collecting these retrieval logs. In the second half of 2022, Saturn will open the mainnet and determine the payout associated with each RP's contribution.

The obvious problem with self-reporting is that it opens up some attack vectors.

picture

As shown in the diagram above, an RP can collude with retrieval clients, which fraudulently create thousands of "fake" retrievals to inflate the log count. Likewise, the RP can simply send extra "fake" logs to the log endpoint. Alternatively, a more deliberate attacker might launch thousands of RCs and have them make requests to the RPs the attacker controls.

These attack vectors are well known. The Saturn team will work with CryptoEconLab in the second half of 2022 to develop a fraud detection module to analyze Saturn retrieval logs.

Titan

The New Web Group is developing a retrieval network called Titan Ultra. The team chose a different approach from Saturn's for proving contributions to the network. In the Titan network, validator nodes perform retrieval tests on RPs and report the results to an (initially) centralized coordinator.

picture

This makes it harder for RPs to cheat the network, since they must keep providing good service in case a validator tests them. This approach to network measurement is similar to that employed by Meson Network, Media Network, Theta, and the Storage Metrics DAO envisioned by CryptoNetLab.

The Titan team completed an initial research grant between January and April 2022, and a follow-on grant is now underway to deploy a PoC version of the Titan Ultra Network, which will land in Q3 2022.
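A validator-style retrieval test can be sketched as random spot checks. This is an illustration under stated assumptions, not Titan's protocol: the validator is assumed to know a SHA-256 digest for each CID it tests, and `spot_check` and its parameters are hypothetical.

```python
import hashlib
import random
from typing import Callable, Dict, Optional


def spot_check(
    rp_fetch: Callable[[str], Optional[bytes]],
    expected_digests: Dict[str, str],
    rounds: int,
    rng: random.Random,
) -> float:
    """Return the fraction of random retrieval tests that the RP passes."""
    cids = sorted(expected_digests)
    passed = 0
    for _ in range(rounds):
        cid = rng.choice(cids)            # the RP cannot predict which CID is tested
        data = rp_fetch(cid)
        if data is not None and hashlib.sha256(data).hexdigest() == expected_digests[cid]:
            passed += 1
    return passed / rounds
```

A coordinator could then weight rewards or reputation by the pass rate that validators report for each RP.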

Proof to Payment

Regardless of how RPs prove their contributions to the network, all networks face the challenge of how to manage payments to RPs based on those proofs. We will see progress on this step in the second half of 2022.

Retrieval Pinning

In addition to payment systems and market creation, systems that penalize failed retrievals are another way to incentivize RPs to provide reliable service. CryptoNet's Retrieval Pinning project implements such a system. Its two key elements are a smart contract and a referee network. The smart contract lets clients and providers agree on a "retrievability" deal for a given CID. When the deal is signed, collateral from the provider is locked in the contract. Trusted referees retrieve the files from the provider and, on bad service, trigger the smart contract to "slash" the provider (i.e. the provider's collateral is lost).
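The lock-then-slash lifecycle can be modelled as a small state object. This is a toy model only: it is not CryptoNet's contract, and `RetrievalDeal` and `report` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RetrievalDeal:
    """Toy model of a retrievability deal with locked provider collateral."""

    cid: str
    collateral: int        # locked by the provider when the deal is signed
    slashed: bool = False

    def report(self, referee_retrieved_ok: bool) -> int:
        """Apply a referee's verdict and return the collateral the provider loses."""
        if self.slashed or referee_retrieved_ok:
            return 0                       # good service, or nothing left to slash
        self.slashed = True
        lost, self.collateral = self.collateral, 0
        return lost
```

In this sketch a single failed retrieval slashes the full collateral; a real system would need a policy for partial slashing, repeated checks, and referee disagreement.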

Topic 3: Payment Channels

The first half of 2022 saw great progress in the payment channel workstream. Magmo has been working on a grant to build go-nitro, a client for multi-hop payment channels in the Filecoin network. Magmo completed the initial grant at the end of June 2022. In July 2022, Magmo will begin a follow-on grant to productionize go-nitro and will join the FVM Foundry to begin work on the on-chain components of go-nitro.

picture

In short, the point of go-nitro is to move from the diagram on the left to the diagram on the right. On the left, when an RC wants to retrieve from an SP, the two have to set up a pairwise payment channel, which requires an on-chain transaction. This means that we end up with a large number of payment channels, and each new SP that the RC wants to get data from needs an additional payment channel.

On the right, we envision a setup where RCs and SPs each create a one-time payment channel with their favorite hub (aka Payment Channel Provider). The hubs, in turn, have payment channels between each other. With this setup, we can establish an off-chain virtual channel between RC and SP. Once the virtual payment is done, we can settle it by moving funds across the three collateralized payment channels. This greatly reduces the number of on-chain payment channels and means that paying a new SP requires no on-chain transaction.
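A quick worked count shows why the hub topology reduces on-chain channels. The sketch assumes the worst case on the left (every RC retrieves from every SP) and, on the right, one channel per participant plus a channel between every pair of hubs; the function names and the numbers in the usage note are illustrative only.

```python
def pairwise_channels(n_rc: int, n_sp: int) -> int:
    """Left-hand diagram: one on-chain channel per (RC, SP) pair."""
    return n_rc * n_sp


def hub_channels(n_rc: int, n_sp: int, n_hubs: int) -> int:
    """Right-hand diagram: one channel per participant to its hub,
    plus one channel between every pair of hubs (assumed fully connected)."""
    return n_rc + n_sp + n_hubs * (n_hubs - 1) // 2
```

For example, with 1,000 RCs, 100 SPs, and 2 hubs, `pairwise_channels(1000, 100)` is 100,000 while `hub_channels(1000, 100, 2)` is 1,101.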

Additionally, WCGCYX continued to work on FCR, a proxy payment and retrieval network. The idea here is that if an RP doesn't have a file, it can ask its neighbors for that file, and so on. When the file is found, it is returned to the client, and the payment channels between all intermediary providers are then used to proxy the payment, with each provider charging a small fee along the way. We are looking for a team to take on the great work FCR has already started.
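The ask-your-neighbors lookup and the per-hop fees can be sketched as a graph search. This is a simplified model rather than FCR's actual protocol: `Node`, `find_route`, `total_fees`, and the flat per-node fee are all hypothetical.

```python
from typing import List, Optional, Set


class Node:
    """Toy provider with a set of held CIDs, a proxy fee, and neighbours."""

    def __init__(self, name: str, files: Set[str], fee: int = 1):
        self.name, self.files, self.fee = name, files, fee
        self.neighbours: List["Node"] = []


def find_route(node: Node, cid: str, seen: Optional[Set[str]] = None) -> Optional[List[Node]]:
    """Depth-first search through neighbours; returns the path to a node holding `cid`."""
    seen = seen if seen is not None else set()
    seen.add(node.name)
    if cid in node.files:
        return [node]
    for nb in node.neighbours:
        if nb.name not in seen:            # avoid asking the same node twice
            route = find_route(nb, cid, seen)
            if route is not None:
                return [node] + route
    return None


def total_fees(route: List[Node]) -> int:
    """Every node except the final holder proxies the payment and takes its fee."""
    return sum(n.fee for n in route[:-1])
```

The file travels back along the route, and the client's payment travels forward along the same route, shrinking by each intermediary's fee before it reaches the node that held the file.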

Topic 4: Reputation System

In April 2022, Ken Labs completed a grant to build Pando, an off-chain verifiable data store for network data and metadata.

Ken Labs completed its first grant in March 2022 and immediately moved on to a follow-on grant to integrate Pando with services like Dealbot, Filecoin Green, Auto-retrieve, and more. This follow-on grant will also see Ken Labs build a monitoring system and web UI for Pando. The follow-on grant runs through September 2022.

Furthermore, reputation is closely related to the cryptoeconomics topic above, where we described how RPs can prove their network contributions to coordinators or validators. Data from each of these retrieval tests can be used to build reputations for SPs and RPs. CryptoNetLab is working on this through their Retrievability Oracle program.

Topic 5: Indexing

Indexer

In March 2022, the Protocol Labs Data Systems team released the indexer. The indexer stores a mapping from CIDs to the SPs that store them. It has been able to scale to billions of records.

In the RMWG, both Leeway Hertz and Ken Labs are running an indexer node and are exploring additional tooling and testing built around the indexer.
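Conceptually, the indexer is a lookup table from CID to the providers that hold it. The sketch below illustrates only that data structure; it is not the real indexer's API, and `Indexer`, `advertise`, and `find` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class Indexer:
    """Toy in-memory index from CID to the IDs of providers storing it."""

    def __init__(self) -> None:
        self._providers: Dict[str, Set[str]] = defaultdict(set)

    def advertise(self, provider_id: str, cids: Iterable[str]) -> None:
        """Record that `provider_id` stores each CID in `cids`."""
        for cid in cids:
            self._providers[cid].add(provider_id)

    def find(self, cid: str) -> Set[str]:
        """Return the providers known to store `cid` (empty set if none)."""
        return set(self._providers.get(cid, set()))
```

A production indexer at the scale of billions of records needs persistent, sharded storage rather than an in-memory dictionary.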

Content indexing

Between January and March 2022, ChainSafe worked on a research grant on content indexing for the Filecoin network. Although progress was made and written up, it is too early to commit to content indexing; we should wait until retrieval from RPs and SPs is more performant and reliable.

Topic 6: Data transfer and transfer protocols

Between January and March 2022, the Myel team received a grant to build JS-graphsync. Between April and June 2022, the Myel team then worked on a grant to build rust-graphsync. This is still closed source, but will be open source soon. Implementing Graphsync in these two languages provides a key building block for the JS and Rust IPFS and Filecoin stacks.

Between April and June 2022, ChainSafe participated in a WebRTC research grant to determine how well the WebRTC protocol suite works across different browsers. They are writing up their findings to share by the end of June 2022.

Additionally, both Titan and Myel benchmarked retrieval from providers behind NATs in home networks. In both cases, the teams found that performance was suboptimal and that multithreaded retrieval was probably the best way forward.

Topic 7: Browser Retrieval

The Saturn and Myel teams spent the most time thinking about browser retrieval in the first half of 2022.

Between March and June 2022, the Saturn team built a service worker that performs incremental validation of CAR files. This is needed because the browser must be able to verify the files it retrieves from a decentralized network, since it has no implicit trust in the server it is getting data from when it first connects.

Throughout the first half of 2022, the Myel team worked on running Myel PoP nodes in service workers as well as in browser extensions. They rewrote their Myel PoP node in Rust so that it can run in the browser after compiling to WASM.
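Incremental validation means checking each block as it arrives, rather than trusting the whole download. The sketch below is a deliberate simplification: it does not parse the real CAR format or real CIDs, and instead assumes a stream of (expected SHA-256 hex digest, block bytes) pairs. `verified_blocks` is a hypothetical name.

```python
import hashlib
from typing import Iterable, Iterator, Tuple


def verified_blocks(stream: Iterable[Tuple[str, bytes]]) -> Iterator[bytes]:
    """Yield each block only after its bytes hash to the digest it claims."""
    for expected_hex, data in stream:
        if hashlib.sha256(data).hexdigest() != expected_hex:
            # Stop at the first bad block; earlier blocks were already verified.
            raise ValueError("block does not match its digest")
        yield data
```

Because verification happens per block, the client can start using verified data immediately and abort as soon as an untrusted server sends a block that does not match.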

Topic 8: Network Monitoring

Leeway Hertz developed a Web3 CDN comparison dashboard between February and June 2022. The team also wrote an explainer for the dashboard. Work on this dashboard continues in two directions:

  1. Bring more Web3 CDNs to the dashboard.

  2. Deploy more retrieval bots that retrieve from different locations around the globe.

Leeway Hertz also built a dashboard for Saturn to help the team monitor the performance of the network. The team is also now considering building a dashboard to show the retrieval performance of SPs.

Learn more about RMWG

There are several places where you can find more information about the work of the RMWG.

Between February and April 2022, Onda Studio developed a new website for the RMWG: https://retrieval.market. There you can find links to all teams, projects, and open opportunities.

Another good place to find information is the Retrieval Market Notion folder, which we have been updating weekly since early 2022.

You can also find RM Demo Day recordings on our YouTube channel, with new content every few weeks throughout the first half of 2022.

We're always looking for more teams to get involved, so get in touch if you're interested.

Looking forward to the second half of 2022!
