From Jeff Jirsa <>
Subject Re: Need help on dealing with Cassandra robustness and zombie data
Date Mon, 01 Jul 2019 14:08:48 GMT
What you’re describing is likely impossible to do in Cassandra the way you’re thinking.

The only practical way to do it is extending gc_grace_seconds (gcgs) and making the tombstone
reads less expensive (ordering the clustering columns so you’re not scanning the tombstones,
or breaking the partitions into buckets so you can cap the tombstones per partition), or using
a strongly consistent database (I’d probably use something like sharded MySQL or similar).
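
To illustrate the bucketing idea, here is a minimal sketch assuming time-based buckets (one common way to split a partition; a hash-based split would work too). The names `bucket_for` and `partition_key` and the 600-second bucket width are hypothetical and not from this thread. The bucket number becomes part of the partition key, so tombstones written in one window stay in that window's partition and a read of the current bucket does not scan older ones.

```python
from typing import Tuple

# Hypothetical bucket width in seconds; tune it to your write/delete rate.
BUCKET_SECONDS = 600


def bucket_for(epoch_seconds: int, bucket_seconds: int = BUCKET_SECONDS) -> int:
    """Map a timestamp to a bucket number using integer division."""
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    return epoch_seconds // bucket_seconds


def partition_key(entity_id: str, epoch_seconds: int,
                  bucket_seconds: int = BUCKET_SECONDS) -> Tuple[str, int]:
    """Build a composite (entity, bucket) partition key.

    Rows and their tombstones for one entity are spread over many
    partitions, one per time window, instead of piling up in one.
    """
    return (entity_id, bucket_for(epoch_seconds, bucket_seconds))
```

The number of tombstones a single-partition read can hit is then bounded by the deletes issued within one bucket window, at the cost of querying several buckets when a read spans more than one window.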

> On Jul 1, 2019, at 6:45 AM, yuping wang <> wrote:
> Thank you; very helpful.
> But we do have some difficulties
> #1 The Cassandra process itself didn’t go down when the node was marked as “DN” (the node
> might just be temporarily having a hiccup and be unreachable), so would disabling auto-start
> still help?
> #2 We can’t set a longer gc grace because we are very sensitive to latency, and we
> have a lot of data coming in and going out, so we can’t afford to keep that many tombstones.
> #3 The question is: what is a reliable way to detect a change of node status? We tried
> using a crontab job to poll node status every 5 minutes, but we still end up missing some
> changes of status, especially if the node is bouncing up and down. Also, by the time we detect
> the problem and try to replace the node permanently, we might have already exceeded the grace period.
> Thanks again,
> Yuping 
> On Jul 1, 2019, at 9:02 AM, Rhys Campbell <> wrote:
> #1 Set the cassandra service to not auto-start.
> #2 Longer gc_grace time would help
> #3 Rebootstrap?
> If the node doesn't come back within gc_grace_seconds, remove the node, wipe it, and
> bootstrap it again.
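
The rule above (rebootstrap a node that stays down past gc_grace_seconds) can be sketched as follows. This is only an illustration under assumptions: it expects the text output of `nodetool status`, where node lines begin with a two-letter code such as `UN` or `DN`, and the helper names `parse_status`, `update_down_since`, and `needs_rebootstrap` are hypothetical. Running `nodetool` and scheduling the poll are left out.

```python
import re
from typing import Dict, List

# Node lines in `nodetool status` output begin with a two-letter code:
# U/D (up/down) followed by N/L/J/M (normal/leaving/joining/moving).
_NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def parse_status(text: str) -> Dict[str, str]:
    """Return {address: state} for every node line in the given output."""
    states: Dict[str, str] = {}
    for line in text.splitlines():
        match = _NODE_LINE.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def update_down_since(down_since: Dict[str, float],
                      states: Dict[str, str],
                      now: float) -> Dict[str, float]:
    """Keep the first time each node was seen down; drop nodes that are up."""
    updated: Dict[str, float] = {}
    for address, state in states.items():
        if state.startswith("D"):
            updated[address] = down_since.get(address, now)
    return updated


def needs_rebootstrap(down_since: Dict[str, float], now: float,
                      gc_grace_seconds: float) -> List[str]:
    """List nodes whose observed downtime has reached gc_grace_seconds."""
    return sorted(address for address, since in down_since.items()
                  if now - since >= gc_grace_seconds)
```

The observed downtime is only a lower bound, since a node is first seen down some time after it actually went down, and a node that bounces between two polls is missed entirely, which is the limitation raised in point #3 of the reply. A poll interval well below gc_grace_seconds narrows that gap but does not remove it.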
> On Mon, Jul 1, 2019 at 13:33, yuping wang <> wrote:
>> Hi all,
>>   Sorry for the interruption. But I need help.
>> Due to specific reasons of our use case, we have gc grace on the order of 10
>> minutes instead of the default 10 days. Since we have a large number of nodes in our Cassandra
>> fleet, not surprisingly, we occasionally see node status going from up to down and
>> up again. The problem is that when the down node rejoins the cluster after 15 minutes, it
>> automatically adds already-deleted data back, causing zombie data.
>> Our questions:
>> Is there a way to not allow a down node to rejoin the cluster?
>> Or is there a way to configure a rejoining node to not add stale data back, regardless
>> of how long the node was down before rejoining?
>> Or is there a way to automatically clean up the data when rejoining?
>> We know adding that data back is a conservative approach to avoid data loss, but
>> in our specific case, we are not worried about deleted data being revived... we don’t have
>> such a use case. We really need a non-default option to never add back deleted data on rejoining.
>> This functionality will ultimately be a deciding factor in whether we can continue
>> with Cassandra.
>> Thanks again,

