Customer Business Impact
From approximately 13:08 UTC to 13:40 UTC on 2021-06-09, all MindTouch sites experienced increased latency and error rates.
Problem Summary
A configuration change to a backend service caused it to pick up more events for processing than expected. The service put extra load on the database, which consumed database resources and caused incoming requests to back up.
Recovery:
Once the cause was identified, the backend service was scaled down to reduce the load on the database.
Affected MindTouch Services:
Corrective Actions:
Temporarily reduced the number of backend services running simultaneously
Using the read copy of the database so that the backend service does not share database processing load with general traffic
Other corrective actions have been identified but need additional research
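As an illustration only, the first two corrective actions could be expressed in a worker configuration along the following lines. This is a minimal sketch under stated assumptions, not the actual MindTouch implementation: the names WorkerConfig, choose_dsn, take_batch and every field shown are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkerConfig:
    # All fields are hypothetical names used for illustration.
    max_workers: int      # cap on backend service instances running at once
    max_batch_size: int   # cap on events picked up per processing cycle
    primary_dsn: str      # connection string for the primary database
    replica_dsn: str      # connection string for the read copy


def choose_dsn(config: WorkerConfig, read_only: bool) -> str:
    """Send read-only work to the read copy so it does not compete
    with general traffic on the primary database."""
    return config.replica_dsn if read_only else config.primary_dsn


def take_batch(pending_events: List[int], config: WorkerConfig) -> List[int]:
    """Pick up at most max_batch_size events, so a configuration change
    cannot cause an unbounded number of events to be processed at once."""
    return pending_events[: config.max_batch_size]
```

In this sketch, the batch cap bounds how much work a single cycle can generate, while the read-copy routing keeps that work off the database that serves general traffic.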