RSA SecurID Access Service Incident (NA Region) NA2 Auth and NA3 Auth

Incident Report for RSA ID Plus

Postmortem

The incidents that occurred on Feb 2nd and Feb 16th resulted from the same set of conditions, which degraded authentication services for some customers hosted in NA2 and NA3. While creating new nodes in preparation for scheduled cloud maintenance, we reconfigured our load balancer pools to add the new nodes. This caused temporary connection problems to the existing front-end nodes and high CPU utilization on all front-end nodes for about 30 minutes. The high CPU utilization slowed response times, which in turn caused authentication failures.

Mitigations

RSA is continuously taking steps to improve the RSA SecurID Access service and its processes to help ensure that such incidents do not occur in the future. These steps include, but are not limited to:

RSA has added capacity to the front-end infrastructure to mitigate the observed connection problems and to avoid high CPU utilization on front-end nodes.
RSA has also moved these cloud maintenance preparation steps to off-peak hours, when system load is typically lowest.

Posted Mar 19, 2021 - 21:15 UTC

Resolved

RSA SaaS Operations has identified and resolved an incident affecting RSA SecurID Access in the na2.access and na3.access authentication services.

Posted Feb 03, 2021 - 02:00 UTC