SecurID Service Incident (EMEA Region)
Incident Report for SecurID
Postmortem

On July 21, 2022, authentication services hosted in the EU region were degraded for 1 hour and 40 minutes (from 12:32 AM ET to 02:12 AM ET). All Cloud Authentication Service (CAS) authentications in the EU region were impacted during this incident. The incident was caused by a failure at our cloud service provider that affected our EU West deployment. SecurID CAS systems in this region lost connectivity to the backend database services hosted by our cloud provider. Our cloud service provider (Microsoft) has identified that the failure resulted from an operator error that led to an incorrect action being performed. Details from Microsoft can be found here.

During the outage, the SecurID Operations team enacted our incident and disaster recovery (DR) response plans. Following our DR procedures, we began the failover process to our secondary deployment hosted in EU North while simultaneously working to restore services in the existing primary site in EU West. Microsoft was able to mitigate the incident in EU West, which restored the Authentication service, before we could complete the DR failover procedures.

Mitigations

SecurID is continuously taking steps to improve the SecurID Access service and processes to help ensure such incidents do not occur in the future. This includes, but is not limited to:

  • As of 8/3/2022, SecurID Engineering teams have finished deploying Active-Warm to our EMEA and ANZ regions. Our new Active-Warm architecture replicates all changes between the two sites within a region. In the event of an outage at one site, tenant traffic can be directed to the other site in the region, greatly reducing the time needed to perform a failover and thus minimizing the impact of an outage on end users. The Active-Warm rollout to our North American region is on track to be completed in August 2022.
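
For readers unfamiliar with the pattern, the Active-Warm behavior described above can be sketched roughly as follows. This is a minimal illustration only, not the SecurID implementation: the class names, the health flag, and the synchronous in-memory replication are all assumptions made for the example, and a production system would typically replicate asynchronously and rely on external health checks.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical site within a region (names are illustrative)."""
    name: str
    healthy: bool = True


class ActiveWarmRegion:
    """Toy model: one active site, one warm site, changes replicated to both."""

    def __init__(self, active: Site, warm: Site):
        self.active = active
        self.warm = warm
        # One in-memory store per site, standing in for the backend database.
        self.data = {active.name: {}, warm.name: {}}

    def write(self, key, value):
        # Assumption: replication is modeled as a synchronous copy to both sites.
        self.data[self.active.name][key] = value
        self.data[self.warm.name][key] = value

    def route(self) -> Site:
        # Direct traffic to the warm site if the active site is unhealthy.
        if not self.active.healthy and self.warm.healthy:
            self.active, self.warm = self.warm, self.active
        if not self.active.healthy:
            raise RuntimeError("no healthy site in region")
        return self.active

    def read(self, key):
        return self.data[self.route().name][key]


region = ActiveWarmRegion(Site("eu-west"), Site("eu-north"))
region.write("tenant-1", "policy-a")
region.active.healthy = False          # simulate an outage at the active site
print(region.route().name)             # traffic now goes to the other site
print(region.read("tenant-1"))         # replicated data is still available
```

Because the warm site already holds the replicated changes, the failover in this sketch is only a routing decision, which is the reason the architecture shortens failover time.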

We apologize for any inconvenience.

Posted Aug 05, 2022 - 14:47 UTC

Resolved
After monitoring the fix, SaaS Operations has determined that the incident affecting SecurID has been resolved.

We will post a root cause analysis as soon as it is available.
Posted Jul 21, 2022 - 07:22 UTC
Monitoring
The issue affecting SecurID has been corrected. The SaaS Operations team is working with our cloud service provider to monitor the fix.

We will post a root cause analysis as soon as it is available.
Posted Jul 21, 2022 - 06:24 UTC
Update
SaaS Operations has identified the cause of the outage as a service outage at our cloud provider. We are currently working on implementing a fix.
Posted Jul 21, 2022 - 05:41 UTC
Identified
We have detected an issue affecting SecurID.
SaaS Operations is investigating the issue and will post updates as they become available.
Posted Jul 21, 2022 - 04:58 UTC
This incident affected: EMEA (access-eu Authentication Service, eu2.access-eu Authentication Service).