Research Report · CDT-RR-2026-01

The Annual State of Cloud Reliability

Our flagship review of how the three major hyperscalers performed in 2026 — measured uptime, where downtime came from, how fast each recovered, and what an hour of outage now costs the business.

Published: June 2026
Coverage: AWS · Azure · GCP
Incidents analysed: 147
Pages: 32

01 · Executive summary

2026 was the most reliable year on record for the major clouds — and also the most expensive to be offline. Blended customer-impacting downtime across AWS, Azure and Google Cloud fell 14% year-over-year, yet the median cost of an outage minute rose sharply as workloads grew more real-time and interdependent.

Google Cloud led the field on every availability metric we track, helped by aggressive progressive-rollout tooling. Azure posted the highest downtime of the three, driven by a cluster of identity and authentication incidents. AWS sat in between — improving steadily, but still carrying concentration risk in us-east-1.

−14%

Aggregate downtime, YoY

Blended customer-impacting downtime across the three hyperscalers fell for the second consecutive year, driven by faster automated rollback.

31%

Of outages were network-rooted

Networking and DNS remain the single largest root-cause category, ahead of power, hardware, and software deploys.

74 min

Blended mean time to recovery

MTTR improved 9% year-over-year, but the gap between the fastest and slowest provider widened to 25 minutes.

2.3×

Cost of unplanned downtime

Median per-minute cost of cloud downtime for mid-market firms rose sharply to $5,600 as workloads took on tighter real-time dependencies.
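To put the headline cards in context, the sketch below combines two of them into a rough per-incident figure and backs out last year's MTTR from the stated 9% improvement. It rests on assumptions the report does not make: that an outage lasts about as long as the blended MTTR, that cost scales linearly with duration, and that a median cost can be paired with a mean duration. The function names are hypothetical, and the result is an order-of-magnitude illustration rather than the report's downtime cost model.

```python
# Figures taken from the summary cards; everything else is an assumption.
COST_PER_MINUTE_USD = 5_600   # median per-minute cost, mid-market firms
BLENDED_MTTR_MIN = 74         # blended mean time to recovery, in minutes
MTTR_IMPROVEMENT = 0.09       # MTTR improved 9% year-over-year


def outage_cost(minutes: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Linear estimate (assumption): outage duration times per-minute cost."""
    return minutes * cost_per_minute


def prior_year_value(current: float, improvement: float) -> float:
    """Back out last year's value from this year's value and a fractional improvement."""
    return current / (1.0 - improvement)


# Cost of one outage that lasts exactly the blended MTTR: 74 * 5,600 = 414,400 USD.
typical_cost = outage_cost(BLENDED_MTTR_MIN)

# Implied blended MTTR one year earlier: 74 / 0.91, roughly 81.3 minutes.
prior_mttr = prior_year_value(BLENDED_MTTR_MIN, MTTR_IMPROVEMENT)

print(f"Estimated cost of a {BLENDED_MTTR_MIN}-minute outage: ${typical_cost:,.0f}")
print(f"Implied prior-year blended MTTR: {prior_mttr:.1f} min")
```

Under these assumptions a single outage of average recovery length costs a mid-market firm roughly $414,000, which is why minutes shaved off recovery matter more than further small gains in uptime.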

02 · Reliability positioning

We plot each provider on demonstrated reliability against operational maturity. Bubble size reflects market presence. All three qualify as Leaders or Strong Performers — the gaps are in execution, not capability.

GCP (Leaders): Reliability 89 · Maturity 82 · Presence 62
AWS (Leaders): Reliability 83 · Maturity 88 · Presence 95
Azure (Strong Performers): Reliability 66 · Maturity 71 · Presence 84

03 · Where downtime comes from

Across all three providers, networking and DNS remained the single largest root-cause category — a pattern that has held for three years running. Software deploys are the fastest-shrinking category as progressive rollout and automated rollback mature.

Networking / DNS: 31%
Power & Hardware: 22%
Software Deploys: 19%
Capacity / Scaling: 16%
Configuration: 12%

“The reliability conversation has shifted from ‘will it go down’ to ‘how fast can it come back’ — and that is where the providers now compete.”
Outlook for 2027 · CloudDowntime Research

Methodology

Findings draw on 147 customer-impacting incidents recorded across AWS, Azure and Google Cloud between January and June 2026, normalised by tracked-service footprint. Downtime is counted only where customer impact was independently observed. Second-half figures are projected from trailing run-rate and labelled as such. Figures in this preview are illustrative placeholders pending the full dataset.
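The two adjustments named above, run-rate projection and footprint normalisation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's actual method: it treats "trailing run-rate" as a constant monthly rate over the six observed months, it applies the projection to the incident count purely as an example, and the tracked-service count is a hypothetical placeholder, since the report does not publish one.

```python
def project_full_year(observed_total: float, months_observed: int = 6) -> float:
    """Extrapolate a full-year total from a partial-year total.

    Assumption: the monthly rate seen so far holds for the rest of the year.
    """
    if not 1 <= months_observed <= 12:
        raise ValueError("months_observed must be between 1 and 12")
    monthly_rate = observed_total / months_observed
    return monthly_rate * 12


def incidents_per_service(incidents: int, tracked_services: int) -> float:
    """Normalise an incident count by the number of tracked services."""
    if tracked_services <= 0:
        raise ValueError("tracked_services must be positive")
    return incidents / tracked_services


INCIDENTS_H1 = 147                    # from the report: January to June 2026
HYPOTHETICAL_TRACKED_SERVICES = 210   # placeholder, NOT a figure from the report

projected_incidents = project_full_year(INCIDENTS_H1)   # 147 / 6 * 12 = 294.0
rate = incidents_per_service(INCIDENTS_H1, HYPOTHETICAL_TRACKED_SERVICES)

print(f"Projected full-year incidents: {projected_incidents:.0f}")
print(f"Incidents per tracked service (hypothetical footprint): {rate:.2f}")
```

A straight-line projection of this kind ignores seasonality and one-off incident clusters, which is one reason projected second-half figures are labelled separately from observed ones.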

Get the full 32-page report

Per-provider deep dives, regional breakdowns, and the downtime cost model.
