MindTouch degraded performance: Sites with MindTouch.us domains are unavailable
Incident Report for MindTouch
Postmortem

Root Cause Analysis:

On August 8th, 2019, at approximately 7:30am (PT), customer sites with a *.mindtouch.us domain began returning a 504 (Gateway Timeout) error, and pages on those sites did not load.

Customer Business Impact:

Customer sites with a *.mindtouch.us domain were unusable until the issue was resolved at approximately 8:45am (PT).

Problem Summary:

On August 8th, the automated release process began creating the infrastructure for the new release. New load balancers were launched to replace the previous ones. A code defect caused the tool that generates our load balancer configuration files to fail, so the new load balancers could not route traffic for the affected sites.

Recovery:

The defect in the current version of the code was identified and a prior, stable version was deployed to replace the faulty one.

Root Cause Summary:

During the weekly product release, an issue with the system kept the new load balancers from routing site traffic. The issue only affected routing for sites with “.mindtouch.us” domains. The load balancer configuration update was not part of the automated deployment process, so the issue surfaced on the day of the release instead of being found during prior testing.

Corrective Actions:

The tool used to update our load balancer configuration has been moved into the automated deployment process, to ensure it is updated with each deployment and tested by QA. Tests for the affected components will be improved.

Posted 6 days ago. Aug 13, 2019 - 19:50 UTC

Resolved
The MindTouch Engineering team has received confirmation from customers that the issue has been resolved. A root cause analysis will be provided here in the near future. Please stand by for further updates on the root cause analysis.
Posted 11 days ago. Aug 08, 2019 - 21:37 UTC
Monitoring
The issue has subsided and the MindTouch Engineering team will continue to monitor the situation to confirm the issue has been resolved.
Posted 11 days ago. Aug 08, 2019 - 15:51 UTC
Investigating
The MindTouch Engineering team is actively investigating reports of service interruption on sites that have a MindTouch.us domain.
Posted 12 days ago. Aug 08, 2019 - 14:45 UTC
This incident affected: Application (General Service).