cassandra-user mailing list archives

From Robert Wille <>
Subject Re: Delete-only work loads crash Cassandra
Date Wed, 15 Apr 2015 12:44:45 GMT
I can readily reproduce the bug, and filed a JIRA ticket:

I’m posting this for posterity.

On Apr 13, 2015, at 11:59 AM, Robert Wille wrote:

Unfortunately, I’ve switched email systems and don’t have my emails from that time period.
I did not file a Jira, and I don’t remember who made the patch for me or if he filed a Jira
on my behalf.

I vaguely recall seeing the fix in the Cassandra change logs, but I just went and read them
and I don’t see it. I’m probably remembering wrong.

My suspicion is that the original patch did not make it into the main branch, and I have
just always had enough concurrent writing to keep Cassandra happy.

Hopefully the author of the patch will read this and be able to chime in.

This issue is very reproducible. I’ll try to come up with some time to write a simple program
that illustrates the problem and file a Jira.



On Apr 13, 2015, at 10:39 AM, Philip Thompson wrote:

Did the original patch make it into upstream? That's unclear. If so, what was the JIRA #?
Have you filed a JIRA for the new problem?

On Mon, Apr 13, 2015 at 12:21 PM, Robert Wille wrote:
Back in 2.0.4 or 2.0.5 I ran into a problem with delete-only workloads. If I did lots of deletes
and no upserts, Cassandra would report that the memtable was 0 bytes because of an accounting
error. The memtable would never flush and Cassandra would eventually die. Someone was kind
enough to create a patch, which seemed to have fixed the problem, but last night it reared
its ugly head again.

I’m now running 2.0.14. I ran a cleanup process on my cluster (10 nodes, RF=3, CL=1). The
workload was pretty light, because this cleanup process is single-threaded and does everything
synchronously. It was performing 4 reads per second and about 3000 deletes per second. Over
the course of many hours, heap slowly grew on all nodes. CPU utilization also increased as
GC consumed an ever-increasing amount of time. Eventually a couple of nodes shed 3.5 GB of
their 7.5 GB heap. Other nodes weren’t so fortunate and started flapping due to 30-second GC
pauses.

The workaround is pretty simple. This cleanup process can simply write a dummy record with
a TTL periodically so that Cassandra can flush its memtables and function properly. However,
I think this probably ought to be fixed. Delete-only workloads can’t be that rare. I can’t
be the only one who needs to go through and clean up their tables.
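
For anyone who needs the workaround in the meantime, here is a rough sketch of what I mean, not a tested fix. It assumes a session object with an execute(cql, params) method, in the style of the DataStax Python driver. The function name, the interval of 10,000 deletes, and the table and column names in the usage comment are all made up for illustration. Whether the dummy row has to go into the same table as the deletes is something I have not verified, so the statements are passed in by the caller.

```python
def delete_with_keepalive(session, keys, delete_cql, dummy_cql, dummy_params,
                          every=10_000):
    """Delete one row per key, issuing a dummy TTL'd write every `every` deletes.

    `session` is assumed to expose execute(cql, params).
    `dummy_cql` should be an INSERT ... USING TTL statement so the dummy
    row expires on its own. Returns (deletes_issued, dummy_writes_issued).
    """
    deletes = 0
    dummy_writes = 0
    for key in keys:
        session.execute(delete_cql, (key,))
        deletes += 1
        # Periodically mix in a real write so the workload is not delete-only.
        if deletes % every == 0:
            session.execute(dummy_cql, dummy_params)
            dummy_writes += 1
    return deletes, dummy_writes


# Hypothetical usage (keyspace, table, and column names are placeholders):
#
#   delete_with_keepalive(
#       session, stale_ids,
#       "DELETE FROM my_ks.my_table WHERE id = %s",
#       "INSERT INTO my_ks.my_table (id, payload) VALUES (%s, %s) USING TTL 60",
#       ("keepalive", "x"),
#   )
```

The TTL keeps the dummy rows from accumulating. The right interval depends on how quickly the heap grows in your cluster, so treat 10,000 as a starting guess.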

