Blockchain Oracles

Artemis week-7

Aug 26, 2022

We are almost there, with only 2 more weeks ahead of us. This will most likely be the last week with a dedicated journal post. Next time I will either do a recap/review of the course or go over my course project (maybe I'm inspired and I can even ship 2 articles in 1 week).

Same trend as last week: we continued working on our course projects and had more "crypto talks". This time we learned about front-end development tailored to web3 needs and heard some entrepreneurship talks.

As for the regular lectures, this week we learned how blockchain oracles work and looked at some of the alternatives most widely used in the industry.

The need for oracles

Blockchains are closed distributed systems. As we have previously seen, this means that each network node has to deterministically find the same output for a given input. Some of the most valuable blockchain properties that arise from this design are decentralization and resistance to network downtime, trustworthy consensus validation of transactions, and the prevention of double-spending.

Nevertheless, all these features come with tradeoffs. Since smart contracts are unable to perform calls to external APIs, the network cannot natively access external data. On top of that, it is impossible to generate cryptographically secure random numbers on-chain within smart contracts.

As you can imagine, oracles are the answer to these problems. Blockchain oracles are third-party services that bring real-world, off-chain information into the network by verifying and relaying the data. Since they bridge the on-chain and the off-chain world, they are known as blockchain middleware. Oracles solve the determinism problem by posting the data on the blockchain itself, so any Ethereum node replaying the transaction will use the same immutable data that's posted for all to see. To do this, an oracle is typically made up of a smart contract and some off-chain components that query APIs and then periodically send transactions to update the smart contract's data.

But fetching off-chain data is just the first part of the problem. Ideally, these oracles should be resistant to manipulation, provide accurate data, and be continuously updated with the latest information. As you can imagine, these properties cannot be ensured by a single (or centralized) oracle provider, since it could end up being hacked, outdated, or disconnected.

Anatomy of a decentralized oracle

At a high level, a decentralized oracle is a network of independent nodes that provide data on-chain, where the data reported by each individual node is aggregated and deterministically validated by the network.

To ensure data accuracy and try to prevent data manipulation, oracle networks should take a decentralized approach in each of their layers:

Data source: An oracle is only as secure as its data sources. To ensure information accuracy, data is fetched from multiple independent data sources. The key to having accurate price data (one of the most common applications for oracles) is to have full market coverage, which prevents data manipulation and inefficiencies due to low volumes.
Node operator: As explained before, any decentralized oracle network should have several independent nodes fetching data. If node operators source price data from multiple independent data sources, they can take statistically representative measures such as the median to mitigate outliers and API downtime.
Oracle network: An oracle network defines how the collection of nodes works together to create a single on-chain reference data point. The most common form of aggregation is taking the median of the reported values once a minimum number of nodes have responded to the request.
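The aggregation step described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `aggregate_reports`, the `min_responses` parameter, and the sample values are made up for this example and do not come from any specific oracle network.

```python
from statistics import median
from typing import Optional, Sequence


def aggregate_reports(reports: Sequence[float], min_responses: int) -> Optional[float]:
    """Return the median of the node reports, or None until enough nodes have answered."""
    if len(reports) < min_responses:
        return None
    return median(reports)


# Five hypothetical node reports for the same price; the last node is an outlier.
reports = [1501.2, 1499.8, 1500.5, 1500.1, 9999.0]

print(aggregate_reports(reports[:2], min_responses=3))  # None (not enough responses yet)
print(aggregate_reports(reports, min_responses=3))      # 1500.5
```

Note how the outlier has no effect on the result, which is the reason the median is usually preferred over the mean here.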

Chainlink decentralized price feed network

Architecture

For the network to fetch data, the interested parties (clients) have to trigger a data request to the oracle network. The chain of actions usually works as follows:

1. A client calls the oracle smart contract to create a new data request.

2. The oracle contract emits an event (a log entry) describing the new request.

3. All the off-chain services that are subscribed to these logs (usually using something like the JSON-RPC eth_subscribe command) receive the event.

4. The off-chain services proceed to do some tasks as defined by the log.

5. The off-chain services respond with the data requested in a secondary transaction to the smart contract.
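To make the five steps more concrete, here is a toy in-memory simulation in Python. It is not real contract code: `OracleContract`, `request_data`, `fulfill`, and `make_offchain_node` are hypothetical names, the "event" is just a callback, and the response happens immediately instead of in a later transaction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class OracleContract:
    """Toy stand-in for the on-chain oracle contract (all names are hypothetical)."""
    next_id: int = 0
    answers: Dict[int, float] = field(default_factory=dict)
    subscribers: List[Callable[[int, str], None]] = field(default_factory=list)

    def request_data(self, query: str) -> int:
        # Steps 1-2: a client creates a request and the contract emits a log.
        request_id = self.next_id
        self.next_id += 1
        # Step 3: every subscribed off-chain service receives the event.
        for notify in self.subscribers:
            notify(request_id, query)
        return request_id

    def fulfill(self, request_id: int, value: float) -> None:
        # Step 5: the off-chain service writes the answer back to the contract.
        self.answers[request_id] = value


def make_offchain_node(contract: OracleContract, fetch: Callable[[str], float]):
    """Build an event handler that fetches the data and responds to the contract."""
    def on_event(request_id: int, query: str) -> None:
        value = fetch(query)  # Step 4: do the off-chain work defined by the log
        contract.fulfill(request_id, value)
    return on_event


contract = OracleContract()
# The fetch function is a stub; a real node would query external APIs here.
contract.subscribers.append(make_offchain_node(contract, lambda query: 1500.0))

request_id = contract.request_data("ETH/USD")
print(contract.answers[request_id])  # 1500.0
```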

From an oracle user's point of view, integrating oracles into a contract is extremely easy and convenient. In fact, the most challenging part is the risk assessment of the oracle that the user decides to rely on. The following code snippet showcases how to create a request to the Chainlink price feed smart contract.

How to fetch data from Chainlink’s price feeds.
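As a rough off-chain sketch in Python of what a consumer does with a price feed: Chainlink's `AggregatorV3Interface` exposes `latestRoundData()` and `decimals()`, and the answer comes back as an integer that has to be scaled by the feed's decimals. Everything else below is an assumption for illustration: `StubPriceFeed` is an in-memory stand-in rather than a real contract binding, and `read_price` with its `max_age_seconds` staleness check is a hypothetical helper.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass
class StubPriceFeed:
    """In-memory stand-in mirroring two AggregatorV3Interface functions."""
    answer: int
    updated_at: int
    feed_decimals: int = 8

    def decimals(self) -> int:
        return self.feed_decimals

    def latestRoundData(self) -> Tuple[int, int, int, int, int]:
        # (roundId, answer, startedAt, updatedAt, answeredInRound)
        return (1, self.answer, self.updated_at, self.updated_at, 1)


def read_price(feed: StubPriceFeed, now: int, max_age_seconds: int) -> Decimal:
    """Scale the integer answer by the feed decimals, rejecting stale or non-positive data."""
    _round_id, answer, _started_at, updated_at, _answered_in_round = feed.latestRoundData()
    if answer <= 0:
        raise ValueError("feed returned a non-positive price")
    if now - updated_at > max_age_seconds:
        raise ValueError("price is stale")
    return Decimal(answer) / (Decimal(10) ** feed.decimals())


# Hypothetical values: an answer with 8 decimals, updated 10 minutes before "now".
feed = StubPriceFeed(answer=150_012_345_678, updated_at=1_661_500_000)
print(read_price(feed, now=1_661_500_600, max_age_seconds=3600))  # 1500.12345678
```

The staleness check is part of the risk assessment mentioned above: the consumer decides how old a price is allowed to be before it refuses to use it.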

TWAP Oracles

Despite the previous definition of blockchain oracles, one can also understand a price oracle as any tool used to fetch price information about a given asset. With this definition in mind, we can see another subset of oracles that lives fully on-chain. We will call them on-chain oracles.

Time-Weighted Average Price (TWAP) oracles were introduced by Uniswap, and are still the most widely adopted form of on-chain oracle.

As we have seen before, an oracle needs to ensure the quality of the provided data and be manipulation resistant. In order to get the latest spot price of an asset, one could just use the liquidity of an AMM pool as a price oracle, dividing the number of tokens currently residing on one side of the pool by the number on the other side to get an exchange rate. Although this approach would give us the best data freshness, it can easily be manipulated with tools such as flash loans.
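A small Python sketch shows why the spot price is so fragile. It assumes a constant-product pool (x * y = k, as in Uniswap v2), ignores fees, and uses made-up reserves; `spot_price` and `swap_base_in` are hypothetical helper names.

```python
from typing import Tuple


def spot_price(reserve_base: float, reserve_quote: float) -> float:
    """Exchange rate implied by the pool: quote tokens per base token."""
    return reserve_quote / reserve_base


def swap_base_in(reserve_base: float, reserve_quote: float, amount_in: float) -> Tuple[float, float]:
    """Constant-product swap (x * y = k), ignoring fees for simplicity."""
    k = reserve_base * reserve_quote
    new_base = reserve_base + amount_in
    new_quote = k / new_base
    return new_base, new_quote


# Hypothetical pool holding 1,000 ETH and 1,500,000 USDC.
eth, usdc = 1_000.0, 1_500_000.0
print(spot_price(eth, usdc))  # 1500.0

# An attacker sells 1,000 borrowed ETH into the pool in a single transaction.
eth, usdc = swap_base_in(eth, usdc, amount_in=1_000.0)
print(spot_price(eth, usdc))  # 375.0
```

Any contract reading the spot price inside that same transaction would see 375 instead of 1500, even though the wider market has not moved at all.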

So instead of just using the latest price, one can sacrifice some accuracy to increase manipulation resistance by using several data points instead of just one.
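The idea can be sketched as a time-weighted average over a list of price observations. This is a simplified Python illustration, not Uniswap's actual implementation (which stores cumulative prices on-chain); the `twap` function and the sample observations are assumptions made for this example, and the observations are assumed to be sorted by timestamp.

```python
from typing import List, Tuple


def twap(observations: List[Tuple[int, float]], end_time: int) -> float:
    """Time-weighted average of (timestamp, price) observations.

    Each price is assumed to hold from its timestamp until the next
    observation, and the last one until end_time.
    """
    if not observations:
        raise ValueError("need at least one observation")
    start_time = observations[0][0]
    if end_time <= start_time:
        raise ValueError("end_time must be after the first observation")

    # Pair every observation with the timestamp at which its price stops applying.
    next_times = [t for t, _ in observations[1:]] + [end_time]
    cumulative = 0.0
    for (t, price), t_next in zip(observations, next_times):
        cumulative += price * (t_next - t)
    return cumulative / (end_time - start_time)


# Hypothetical 30-minute window (timestamps in seconds). The price is pushed
# down to 375 for a single 12-second block and then returns to 1500.
observations = [(0, 1500.0), (600, 1500.0), (1200, 375.0), (1212, 1500.0)]
print(twap(observations, end_time=1800))  # 1492.5
```

A manipulation that moves the spot price by 75% for one block shifts this 30-minute average by only 0.5%, which is the resistance we are paying for with reduced freshness.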

Limitations of the TWAP

Despite TWAP oracles being useful and convenient, they have some limitations when compared to oracles that can fetch off-chain data such as Chainlink.

Freshness/Accuracy: By design, a TWAP is a lagging indicator that provides an average price taken over a period of time. While this approach is fine in periods of low volatility, it can become a significant issue during times of considerable volatility because the TWAP desynchronizes from the real price. As previously explained, manipulation resistance comes at the cost of data freshness, meaning safety is prioritized over accuracy.
Market Coverage: Another critical issue of TWAP oracles is that they are restricted to a single DEX. This lack of market coverage makes it easier for malicious actors to tamper with the price of a pool: it is cheaper, since the liquidity of a single DEX is lower than the overall market liquidity.
Feed Diversity: Finally, and related to the previous point, since TWAP oracles are limited to the pools of a DEX, they can only provide coverage for on-chain assets (that's why we called them on-chain oracles).

So it can be concluded that TWAP oracles are better suited for low-volatility assets with high liquidity.

