Illumination
Infura Autopsy
Surgeon's report from reputation.link - 2020-11-17
Here are the facts. At 08:12 UTC on 11 November 2020, Infura (a hosted Ethereum node cluster that lets users run applications without setting up their own Ethereum node) suffered an ETH1 mainnet API service outage.
The outage significantly disrupted Ethereum-based services for some Infura users. In an official communique, Infura stated:
“We are currently experiencing a service outage for our Ethereum Mainnet API. Our on-call team is investigating and working to restore service functionality. We will post updates here as we have them.”
According to the official Infura press release, the outage lasted for 6.5 hours and normal services resumed at 14:42 UTC.
Though the technical details behind the outage are outside the scope of this article, suffice it to say there was an Ethereum upgrade (essentially an "unannounced hard fork") that caught some users unaware.
Now that we know the why, let's take a look at how the team at reputation.link dissects these types of events.
From a reputation.link perspective, it is exactly these unplanned, highly disruptive events that give our users (both node operators and entities looking to employ node operators) the ability to objectively assess any node’s performance during times of stress.
So what did happen to the Chainlink network during the Infura outage? Did any of the Chainlink feeds go offline? Did any of the node operators suffer downtime?
In short, no. Nothing untoward happened to the network: all the price feeds reported data as normal, and all verified node operators remained online. Infura-like events (and their associated consequences) are exactly what reputation.link analytics can track, monitor, and ultimately report on. And it is this valuable information that forms an important component of our reputation system.
Essentially, we ask the question: how did each node fare when the going got tough? We look at the data, analyse it, and arrange it into an easy-to-understand format. Then we tell the world. It turns out that on this occasion, the node operators and the Chainlink network performed fantastically well.
How does reputation.link assess a node’s performance during these adverse events? We provide a number of methods. At the highest level we can quickly view graphical representations of key metrics such as average response times and transaction history. This information can immediately verify whether a node has suffered downtime.
For example, an absence of transactions or abnormal response times typically means there’s been a problem. Take the Fiews node in the graph below (incidentally, Fiews offer Ethereum as a service and did not suffer an outage). Their average response time continued with no significant change during the 6.5 hours of the Infura outage. The same goes for their transaction graphs. Essentially, there was no significant deviation from the established norm. Their node continued to function as intended.
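To make the idea concrete, here is a minimal sketch of this kind of check in Python. It is not the reputation.link implementation: the function name assess_node, the sample format, the 30-minute gap limit, and the three-standard-deviation threshold are all assumptions chosen for illustration. Only the outage window comes from the figures quoted above.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev

# Outage window as reported in this article (UTC).
OUTAGE_START = datetime(2020, 11, 11, 8, 12, tzinfo=timezone.utc)
OUTAGE_END = datetime(2020, 11, 11, 14, 42, tzinfo=timezone.utc)


def assess_node(samples, max_gap=timedelta(minutes=30), z_limit=3.0):
    """Check one node for downtime signals during the outage window.

    samples: list of (timestamp, response_seconds), one per transaction,
    sorted by time. Needs at least two samples before OUTAGE_START to
    build a baseline. max_gap and z_limit are hypothetical thresholds.
    """
    baseline = [r for t, r in samples if t < OUTAGE_START]
    during = [(t, r) for t, r in samples if OUTAGE_START <= t <= OUTAGE_END]

    # Established norm: mean and spread of response times before the outage.
    mu, sigma = mean(baseline), stdev(baseline)

    # Longest stretch without a transaction inside the window,
    # counting the window edges as boundaries.
    times = [OUTAGE_START] + [t for t, _ in during] + [OUTAGE_END]
    longest_gap = max(b - a for a, b in zip(times, times[1:]))

    # Responses that are far slower than the pre-outage norm.
    slow = [t for t, r in during if r > mu + z_limit * sigma]

    return {
        "transactions": len(during),
        "longest_gap": longest_gap,
        "slow_responses": len(slow),
        "healthy": longest_gap <= max_gap and not slow,
    }
```

A node that keeps transacting at its usual pace with its usual response times comes back as healthy. A long silence or a run of unusually slow responses inside the window flags it for a closer look.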
On a more nuanced level, the team also compiles and examines transactional data from individual price feeds. Again, this information allows users to critically analyse the response of nodes during times of difficulty. Deviations from the norm indicate a problem. Once a node issue is observed, the root cause of the problem can be identified and fixed.
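The per-feed comparison could look something like the following Python sketch. Again, this is an illustration under stated assumptions rather than our production code: the record format (feed name plus whether the node responded), the function names, and the 5% drop threshold are all hypothetical.

```python
from collections import defaultdict


def feed_response_rates(records):
    """Return {feed: fraction of requests the node answered}.

    records: iterable of (feed_name, responded) pairs, where responded
    is True if the node submitted an answer. Hypothetical format.
    """
    answered = defaultdict(int)
    total = defaultdict(int)
    for feed, responded in records:
        total[feed] += 1
        answered[feed] += int(responded)
    return {feed: answered[feed] / total[feed] for feed in total}


def flag_feeds(before, during, max_drop=0.05):
    """List feeds whose response rate fell by more than max_drop
    compared with the pre-event baseline. max_drop is an assumed value."""
    base = feed_response_rates(before)
    now = feed_response_rates(during)
    # A feed with no records at all during the event counts as a rate of 0.
    return sorted(f for f in base if base[f] - now.get(f, 0.0) > max_drop)
```

Any feed returned by flag_feeds is one where the node deviated from its own norm, which is the starting point for finding the root cause.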
Even within mature ecosystems like Ethereum, outages may occur. The Infura issue once again highlights the inherent danger of relying on a centralised Ethereum node provider. During the outage, the Chainlink network and its node operators showed incredible resilience. The node operators demonstrated the benefits of running dedicated Ethereum nodes, of employing various failsafes, and of having competent teams dedicated to the smooth running of said nodes. Close monitoring of the entire Infura event enabled reputation.link to definitively prove that the verified Chainlink node operators continue to live up to their excellent reputation garnered over the past year.
For more updates from reputation.link, subscribe to the newsletter. You can also join the Discord to get involved and talk with the team!