Degraded Auth/Statistics/Transaction Services due to high Cassandra latencies in the Rugby environment
Likely affected endpoints:
/api/v1/transactions/...
/connector/users/{id}/transactions
/api/v1/statistics/query
/api/v1/budgets
/api/v1/insights
/api/v1/insights/action
Duration:
The issue started around 04:10 CET on Jan 5 and ended at 08:25 CET the same day.
Posted Jan 14, 2022 - 10:53 CET
Resolved
We're back to normal. The culprit was a spike in delete/update operations from some services that depend on Cassandra; the spike started at around 04:10 CET and caused one of the Cassandra nodes to go down. Restoring the node resolved the situation.
The Transaction Service was also affected from 04:10 to 08:25 CET, with partial error rates on the affected APIs; degradation peaked at up to a 10% failure rate between 07:50 and 08:15 CET.
Posted Jan 05, 2022 - 08:31 CET
Investigating
The production environment for the Royal Bank of Scotland is the most affected. We have already identified the culprit and have taken action to fix it.