Whilst we were conducting routine maintenance (cleaning up some stale data) on our Cassandra database cluster, part of one of our clusters started to consume a large amount of memory, which led to a partial collapse.
At first, we didn't suspect that this simple database operation was the cause, and we focused on getting the cluster back on its feet. Further investigation revealed that the outage was clearly due to human intervention.
During this outage: