The Annual State of Cloud Reliability
Our flagship review of how the three major hyperscalers performed in 2026 — measured uptime, where downtime came from, how fast each recovered, and what an hour of outage now costs the business.
- Published: June 2026
- Coverage: AWS · Azure · GCP
- Incidents analysed: 147
- Pages: 32
01 · Executive summary
2026 was the most reliable year on record for the major clouds, and also the most expensive to be offline. Blended customer-impacting downtime across AWS, Azure and Google Cloud fell 14% year-over-year, yet the median per-minute cost of an outage rose to $5,600 for mid-market firms as workloads grew more real-time and interdependent.
Google Cloud led the field on every availability metric we track, helped by aggressive progressive-rollout tooling. Azure posted the highest downtime of the three, driven by a cluster of identity and authentication incidents. AWS sat in between — improving steadily, but still carrying concentration risk in us-east-1.
−14%
Aggregate downtime, YoY
Blended customer-impacting downtime across the three hyperscalers fell for the second consecutive year, driven by faster automated rollback.
31%
Of outages were network-rooted
Networking and DNS remain the single largest root-cause category, ahead of power, hardware, and software deploys.
74 min
Blended mean time to recovery
MTTR improved 9% year-over-year, but the gap between the fastest and slowest provider widened to 25 minutes.
2.3×
Cost of unplanned downtime
Median per-minute cost of cloud downtime for mid-market firms rose to $5,600, up sharply on tighter real-time dependencies.
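To put the two headline figures side by side, here is a rough back-of-the-envelope sketch. It multiplies the blended MTTR (74 minutes) by the median per-minute cost ($5,600) to estimate what one outage of typical length might cost. Two assumptions are ours, not the report's: that cost scales linearly with duration, and that the blended MTTR applies to a mid-market firm. The function name `outage_cost` is hypothetical, and this is not the report's downtime cost model.

```python
def outage_cost(duration_min: float, cost_per_min: float) -> float:
    """Linear estimate (an assumption): cost = duration x per-minute cost."""
    if duration_min < 0 or cost_per_min < 0:
        raise ValueError("duration and cost must be non-negative")
    return duration_min * cost_per_min


# Figures taken from the summary cards above.
BLENDED_MTTR_MIN = 74            # blended mean time to recovery, minutes
MEDIAN_COST_PER_MIN_USD = 5_600  # median per-minute cost, mid-market firms

estimate = outage_cost(BLENDED_MTTR_MIN, MEDIAN_COST_PER_MIN_USD)
print(f"${estimate:,.0f}")  # prints $414,400
```

Under those assumptions a single outage that lasts as long as the blended MTTR would cost roughly $414,400. Real costs are unlikely to be linear, so this is best read as an order-of-magnitude figure.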
02 · Reliability positioning
We plot each provider on demonstrated reliability against operational maturity. Bubble size reflects market presence. All three qualify as Leaders or Strong Performers — the gaps are in execution, not capability.
- GCP (Leaders): Reliability 89 · Maturity 82 · Presence 62
- AWS (Leaders): Reliability 83 · Maturity 88 · Presence 95
- Azure (Strong Performers): Reliability 66 · Maturity 71 · Presence 84
03 · Where downtime comes from
Across all three providers, networking and DNS remained the single largest root-cause category — a pattern that has held for three years running. Software deploys are the fastest-shrinking category as progressive rollout and automated rollback mature.
- Networking / DNS: 31%
- Power & Hardware: 22%
- Software Deploys: 19%
- Capacity / Scaling: 16%
- Configuration: 12%
“The reliability conversation has shifted from ‘will it go down’ to ‘how fast can it come back’ — and that is where the providers now compete.”
Methodology
Findings draw on 147 customer-impacting incidents recorded across AWS, Azure and Google Cloud between January and June 2026, normalised by tracked-service footprint. Downtime is counted only where customer impact was independently observed. Second-half figures are projected from trailing run-rate and labelled as such. Figures in this preview are illustrative placeholders pending the full dataset.
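The two adjustments described above, run-rate projection and footprint normalisation, might look like the following minimal sketch. The report does not publish its exact formulas, so everything here is an assumption: the trailing window is taken to be the full January to June observation period, normalisation is expressed as incidents per 100 tracked services, and the function names and the footprint of 300 services are hypothetical illustrations rather than data from the report.

```python
def project_from_run_rate(observed_count: float,
                          observed_months: int,
                          months_ahead: int) -> float:
    """Project a future count by extending the observed monthly rate.

    Assumption: the trailing window is the whole observed period.
    """
    if observed_months <= 0 or months_ahead < 0:
        raise ValueError("observed_months must be > 0 and months_ahead >= 0")
    monthly_rate = observed_count / observed_months
    return monthly_rate * months_ahead


def normalise_by_footprint(incident_count: float,
                           tracked_services: int,
                           per: int = 100) -> float:
    """Incidents per `per` tracked services (the unit is an assumption)."""
    if tracked_services <= 0:
        raise ValueError("tracked_services must be > 0")
    return incident_count * per / tracked_services


# 147 incidents observed between January and June (from the methodology).
h2_projected = project_from_run_rate(147, observed_months=6, months_ahead=6)
print(h2_projected)  # 147.0: a flat run-rate simply repeats the first half

# Hypothetical footprint of 300 tracked services, for illustration only.
print(normalise_by_footprint(147, tracked_services=300))
```

A flat run-rate is the simplest possible projection. It ignores seasonality and any trend within the observed months, which is one reason projected figures should stay labelled as projections.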