On July 1, starting at 15:54 CET, a network malfunction disrupted connectivity for a portion of our bare‑metal infrastructure. As a result, message delivery (email, push, SMS) and several APIs suffered a partial outage or degraded performance. The network disruption persisted for roughly two hours, with connectivity restored by approximately 18:00 CET.
The incident was caused by two redundant network switches that failed to recover correctly after a power issue. Both switches carried a firmware defect that prevented them from resynchronizing, leading to a network partition across the affected racks. The manufacturer has since identified and fixed the defect, and the faulty firmware version is no longer in operation.
In coordination with our hosting provider, we restored network connectivity and progressively brought all impacted systems back online. By approximately 20:54 CET, service levels had fully normalized. No customer data was lost during the incident; however, data submitted through our public APIs and SDKs during the outage window may not have been processed unless the requests were retried.
To improve resilience, we are enhancing our infrastructure monitoring to detect hardware-level network issues faster and with greater precision, and we are refining our recovery tooling and incident response processes to shorten restoration times in similar events.
We sincerely apologize for the disruption and appreciate the trust you place in us. If you have any questions or would like further clarification, please feel free to reach out.