What Does Safety Over Liveness Really Mean?

November 8, 2022

Proof-of-Stake (PoS) is today the most important consensus mechanism for new blockchains, and one of its most compelling features is protocol staking. Token holders can participate in these blockchains (i.e., PoS networks) by delegating tokens to help secure the network in exchange for rewards. However, participants have to be aware of the risks associated with protocol staking. When participants are ready to stake tokens and earn rewards, they must pick a validator. That choice should not be taken lightly, as validators differ in trust, transparency, security, and performance.

The Safety over Liveness Approach

The “safety over liveness” approach is an effective way to earn the trust of token holders. Non-custodial staking providers like Figment adopt it to reduce slashing risk and the potential loss of delegator tokens by optimizing uptime rather than maximizing it. In other words, being offline is better than having a double-signing incident: during turbulent events, safety is preferred to liveness.

The network uptime metric can in fact mislead participants and newcomers. Stakers may think that the higher the uptime, the higher the chances of generating rewards. However, each PoS network has different characteristics, so the rewards performance strategy is nuanced and specific to each one. For example, 99% uptime instead of 95% uptime on a PoS network like Ethereum is not necessarily better if it significantly increases the risk of slashing for a marginal increase in expected rewards. Furthermore, it is almost impossible to have more than 95% uptime on Solana because it requires that all other validators be online as well. Maximizing uptime (i.e., pursuing liveness at all costs) is also dangerous for staking providers and token delegators because the potential slashing penalties are greater than the cost of downtime, which is zero up to a certain point.
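The trade-off between uptime and slashing risk can be made concrete with a small expected-value calculation. The sketch below is only an illustration: the function name, the reward figure, the slashing probabilities, and the penalty size are all hypothetical assumptions, not figures from any network or from Figment's data.

```python
def expected_net_reward(uptime, reward_at_full_uptime, slash_probability, slash_penalty):
    """Expected reward for one period, net of the expected slashing loss.

    All inputs are hypothetical; real reward and penalty rules differ per network.
    """
    expected_reward = uptime * reward_at_full_uptime
    expected_slash_loss = slash_probability * slash_penalty
    return expected_reward - expected_slash_loss


# Assumed numbers, chosen only to illustrate the argument:
# chasing 99% uptime is modeled as making a slashing event ten times more likely.
conservative = expected_net_reward(0.95, 100.0, 0.0001, 10_000.0)
aggressive = expected_net_reward(0.99, 100.0, 0.001, 10_000.0)

print(f"95% uptime, lower slashing risk:  {conservative:.2f}")
print(f"99% uptime, higher slashing risk: {aggressive:.2f}")
```

Under these assumed inputs, the 95% uptime setup comes out ahead (roughly 94 versus 89 units), because the extra four points of uptime are outweighed by the larger expected slashing loss. With different assumptions the result can flip, which is why the strategy has to be evaluated per network.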

Staking providers with expertise in slashing mitigation and prevention tend to have few or no slashing incidents. Providers that focus solely on maximizing uptime, by contrast, have experienced slashing incidents and lost token holder funds.

Our data reinforces the importance of doing due diligence before selecting a validator: 213 slashing events have occurred since May 2022.

Applying The Safety over Liveness Approach During The Merge

In general, staking providers known for prioritizing safety combine deep and diverse expertise with a turnkey staking offering. Through a coordinated team effort spanning customer success, engineering, legal, protocol, and security, staking providers like Figment offer the best staking experience by emphasizing safety over liveness. When the Merge occurred, all staking providers had to make decisive operational decisions around the activation of MEV rewards (see Figment’s MEV Policy: Supporting Neutral, Secure, and Open Solutions). The implementation of MEV-Boost was quite complex and had dependencies that introduced risks to staking providers’ infrastructure. There was always the possibility that it would introduce critical vulnerabilities and result in the loss of staking rewards.

In the case of Figment, we waited for a few hours and then slowly rolled out MEV-Boost once we were confident that the Merge was successful. A true demonstration of “safety over liveness”!

Sustainable and Secure Practices

Over time, slashing will inevitably happen across all staking providers, even those with the best practices, given all of the complexities and events that cannot be anticipated. Now that Ethereum has become a proof-of-stake blockchain, minimizing slashing risk is more important than ever. Figment has adopted Web3Signer, a remote signing solution, to minimize the likelihood of signing two different blocks for the same slot (i.e., double signing).
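The idea behind double-signing protection can be sketched in a few lines. The example below is a simplified, in-memory illustration, not Web3Signer's API or implementation: the class name `DoubleSignGuard` and its method are hypothetical, only block proposals are covered, and a production signer would need durable storage shared by every node that can reach the key.

```python
class DoubleSignGuard:
    """Refuses to sign a block unless its slot is strictly newer than the last one signed.

    Hypothetical sketch: state lives in memory and only block proposals are tracked.
    """

    def __init__(self):
        # validator public key -> highest slot for which a block was signed
        self._last_signed_slot = {}

    def may_sign_block(self, pubkey: str, slot: int) -> bool:
        last = self._last_signed_slot.get(pubkey)
        if last is not None and slot <= last:
            # Safety over liveness: skip the proposal rather than risk a double sign.
            return False
        # Record the slot before any signature is produced.
        self._last_signed_slot[pubkey] = slot
        return True


guard = DoubleSignGuard()
print(guard.may_sign_block("validator-a", 100))  # first request for slot 100
print(guard.may_sign_block("validator-a", 100))  # repeated request for the same slot
```

The first call is allowed and the second is refused. Recording the slot before signing means a crash at the wrong moment costs a missed proposal (lost liveness) rather than a second signature (lost safety).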

Furthermore, should a slashing occur, our fail-over process from detection to resolution is as follows:

  • Figment employs a 24/7 incident response process, which utilizes internal and external monitoring.
  • For internal monitoring, servers run performance monitoring programs (e.g., Prometheus) to produce logs, which are scraped and monitored continuously by platforms (e.g., Datadog). Alerts of various priority levels are triggered based on predefined conditions and delivered to the on-call operations team through multiple platforms (e.g., Slack and PagerDuty).
  • For external monitoring, external APIs for on-chain data are scraped and monitored continuously by platforms for predefined conditions that would indicate an issue with a validator (e.g., a change in validator status to “slashed” or a decrease in a validator wallet balance over a period of time). Alerts of various priority levels are then triggered and delivered to on-call team members.
  • As Figment employs a “safety over liveness” approach when resolving issues, fail-over plans are initiated manually, not automatically.
  • Once the issue is resolved, a retrospective is completed, containing a root cause analysis and short-term and long-term remediation suggestions.
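The external checks described above can be sketched as a comparison of two on-chain snapshots of a validator. This is a minimal illustration under stated assumptions, not Figment's monitoring code: `ValidatorSnapshot`, `evaluate_alerts`, the priority labels, and the balance threshold are all hypothetical, and fetching the snapshots from an on-chain data API is left out.

```python
from dataclasses import dataclass


@dataclass
class ValidatorSnapshot:
    status: str     # e.g. "active" or "slashed", as reported by an on-chain data source
    balance: float  # validator balance at the time of the snapshot


def evaluate_alerts(previous, current, max_balance_drop):
    """Return (priority, message) alerts for predefined conditions.

    The function only reports; it never triggers a fail-over itself,
    which matches the manual fail-over policy described in the text.
    """
    alerts = []
    if current.status == "slashed" and previous.status != "slashed":
        alerts.append(("critical", "validator status changed to slashed"))
    drop = previous.balance - current.balance
    if drop > max_balance_drop:
        alerts.append(("warning", f"balance decreased by {drop:.4f} over the period"))
    return alerts


before = ValidatorSnapshot(status="active", balance=32.0)
after = ValidatorSnapshot(status="slashed", balance=31.0)
for priority, message in evaluate_alerts(before, after, max_balance_drop=0.5):
    print(priority, message)
```

In a real setup the returned alerts would be routed to the on-call team, and a person would decide whether to initiate a fail-over.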

What to Remember

Slashing events are inevitable. Therefore, staking providers have to develop best practices to minimize downtime when a slashing event occurs. This comes down to monitoring and having deep expertise in PoS networks. It’s not about being perfect, but about being the best. To learn more about Figment’s validator performance, take a look at our most recent Ethereum validator performance report here.
