Date: April 26, 2024
Status: Resolved
On April 26, 2024, our service experienced downtime caused by an SSL certificate failure that affected most of our customers from approximately 2:20 AM to 3:00 AM MST. Our SSL provider had recently changed its rules for certificate auto-renewal, and our system did not fully account for those changes. In response, we have enhanced our alerting systems and updated our infrastructure configurations to prevent similar issues in the future.
The disruption lasted about 40 minutes and affected all users who attempted to access our site without a cached version of the SSL certificate.
The primary cause of the downtime was a change to our provider's auto-renewal rules for SSL certificates, which caused the certificate to expire silently and unexpectedly. A fix had already been developed, but it had not yet been fully propagated across the entire Limble infrastructure.
We acted immediately to renew the SSL certificate and mitigate the situation. After the incident, we fully integrated the auto-renewal fix and enhanced our monitoring and alerting so that similar issues are detected before they cause disruption.
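As an illustration of the kind of expiry monitoring described above, the following is a minimal sketch of a check that reports how many days remain before a certificate expires. It is not our production tooling. The function names, the 14-day warning threshold, and the idea of running it on a schedule are all assumptions made for this example, and it relies only on the Python standard library.

```python
import socket
import ssl
from datetime import datetime, timezone

WARN_DAYS = 14  # hypothetical threshold; choose one that leaves time to react


def days_remaining(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by getpeercert(),
    e.g. "May 10 00:00:00 2024 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def classify(days_left: float, warn_days: int = WARN_DAYS) -> str:
    """Map remaining days to an alert level (levels are illustrative)."""
    if days_left <= 0:
        return "expired"
    if days_left <= warn_days:
        return "warning"
    return "ok"


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field of the certificate served by `host`.

    With the default context, an already expired certificate makes the
    handshake raise ssl.SSLCertVerificationError, which a caller should
    also treat as an alert condition.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `classify(days_remaining(fetch_not_after(host), datetime.now(timezone.utc)))` for each public hostname and raise an alert on anything other than "ok". Because the check reads the certificate actually being served, it does not depend on the provider's renewal process reporting its own failures.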