On the morning of October 20, 2025, millions of users encountered the same message: “Connection error.”
The culprit wasn’t their router, but Amazon Web Services (AWS). An incident in its infrastructure caused a global outage, taking thousands of websites and applications offline.
What happened and why it had such an impact
According to Amazon’s official record on the AWS Health Dashboard, the disruption originated in the US-EAST-1 region (Northern Virginia), the region most heavily used by customers worldwide. A combination of internal network issues and failures in resolving critical service endpoints triggered cascading errors affecting authentication, APIs, databases, and load balancing.
Productivity platforms, e-commerce sites, streaming services, and payment providers all felt the effects.
To understand operational dependence on AWS, it’s helpful to see how companies rely on Amazon’s managed services beyond computing. As noted in this Emprender y Más analysis on AWS AI for SMEs, the cloud not only hosts applications but orchestrates supply chains, demand, and routes. When this layer fails, the impact multiplies.
Concentration and fragility
The incident reignites the debate on infrastructure concentration. Three hyperscalers (AWS, Microsoft Azure, and Google Cloud) account for most of the world’s cloud infrastructure capacity. This efficiency comes with systemic risk: a failure in a single provider or region can affect thousands of businesses simultaneously.
In Europe, the push for technological sovereignty seeks to mitigate this. Initiatives like GAIA-X promote federated, interoperable data and cloud service frameworks so resilience doesn’t depend on a single provider or region. The goal isn’t to abandon hyperscalers but to balance architecture with more distributed models.
Redsys, the other alert of the day
That same day, October 20, Spain experienced a card payment blackout due to an incident at Redsys, the country’s major POS transaction processor. The company clarified that its outage was unrelated to AWS, but the timing served as a stark reminder: the digital economy depends on a few critical nodes, whether cloud or financial.
For Spanish businesses, this dual event underscores the need to manage third-party risk and ensure business continuity. Our analysis on regulation and resilience in the financial sector addresses this framework (DORA, ICT and supplier management): Navigating Digital Banking: Regulation and New Risks.
Impact on companies and startups: from outage to costs
Operationally, an outage like this results in downtime, lost sales, SLA degradation, and reputational damage. Cloud-native startups without robust contingency plans are especially vulnerable.
The paradox: the cloud accelerates growth, but an architecture without redundancies can amplify risks when failures occur.
The pattern from these events is clear: companies with multi-region or multicloud architectures, data replication, and feature flags for controlled degradation suffer less and recover faster.
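As an illustration of what "feature flags for controlled degradation" can look like, here is a minimal Python sketch. Everything in it is hypothetical: `FeatureFlags`, `fetch_recommendations`, and `render_product_page` are invented names, the flag store is in memory, and the failing call stands in for any non-essential dependency. It is one possible shape, not a reference implementation.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store; a real one would live in replicated config."""
    disabled: set[str] = field(default_factory=set)

    def is_enabled(self, name: str) -> bool:
        return name not in self.disabled

    def disable(self, name: str) -> None:
        self.disabled.add(name)


def fetch_recommendations(product_id: str) -> list[str]:
    # Stand-in for a non-essential downstream service that is currently down.
    raise ConnectionError("recommendation service unreachable")


def render_product_page(product_id: str, flags: FeatureFlags) -> dict:
    """Build the page; skip the optional block if its flag is off or the call fails."""
    page = {"product_id": product_id, "recommendations": [], "degraded": False}
    if not flags.is_enabled("recommendations"):
        page["degraded"] = True
        return page
    try:
        page["recommendations"] = fetch_recommendations(product_id)
    except ConnectionError:
        # Turn the flag off so later requests skip the failing call entirely.
        flags.disable("recommendations")
        page["degraded"] = True
    return page


flags = FeatureFlags()
page = render_product_page("sku-1", flags)
```

The core page still renders while the optional component is switched off, which is the point of controlled degradation: the failure is contained instead of propagating to the whole request.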
Practical lessons to reduce risk
- Avoid provider monoculture. Design for vendor diversity (multicloud/hybrid) and, at minimum, multi-AZ/multi-region within the same provider.
- Separate critical planes. Keep identity, monitoring, and backups in distinct domains/zones; test actual restorations, not just scripts.
- Smart functional degradation. Maintain a “minimal viable” service mode if non-essential components fail (e.g., offline order queues).
- End-to-end observability. Distributed traces, SLO metrics, and actionable alerts. Third-party tools help, but define your own runbooks.
- Governance and compliance. In regulated sectors, map critical dependencies and continuity tests; European regulations increasingly require this (see DORA approach above).
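Two of the lessons above, multi-region fallback and an offline order queue, can be sketched together in a few lines of Python. This is a simplified illustration under stated assumptions: `OrderGateway`, the region names, and the `Sender` callables are hypothetical, a `ConnectionError` stands in for any regional failure, and the queue lives in memory (a production version would persist it).

```python
from collections import deque
from typing import Callable, Optional

# Hypothetical per-region sender: raises ConnectionError when the region is down.
Sender = Callable[[dict], None]


class OrderGateway:
    def __init__(self, regions: list[tuple[str, Sender]]):
        self.regions = regions        # ordered by preference: primary first
        self.offline_queue = deque()  # orders held while every region is down

    def submit(self, order: dict) -> Optional[str]:
        """Return the name of the region that accepted the order, or None if queued."""
        for name, send in self.regions:
            try:
                send(order)
                return name
            except ConnectionError:
                continue  # this region failed; try the next one
        self.offline_queue.append(order)
        return None

    def replay(self) -> int:
        """Retry each queued order once; return how many were delivered."""
        delivered = 0
        for _ in range(len(self.offline_queue)):
            order = self.offline_queue.popleft()
            if self.submit(order) is not None:
                delivered += 1
        return delivered


def region_down(order: dict) -> None:
    raise ConnectionError("region unavailable")


accepted: list[dict] = []
gateway = OrderGateway([("primary", region_down), ("secondary", accepted.append)])
served_by = gateway.submit({"id": 1})
```

In this sketch an order is never dropped: it goes to the first healthy region, and if none responds it waits in the queue until `replay` is called after recovery. Orders that fail again during replay simply return to the queue.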
To dive deeper into edge computing and near-data processing, see the in-house dossier: offloading workloads to the edge reduces latency and single points of failure.
A more distributed Internet is a more resilient business
The lesson from October 20, 2025, is not “avoid the cloud” but “mature your architecture”: diversify, isolate failures, and design for recovery. Hyperscalers will remain the backbone of the digital ecosystem, but resilience requires complementing their scale with interoperability, federation, and redundancy.
The good news for entrepreneurs and CTOs is that these capabilities are increasingly accessible, and the providers themselves document them. The challenge is no longer technological: it’s about priorities and governance.