cassandra-user mailing list archives

From Robert Wille <>
Subject Re: does consistency=ALL for deletes obviate the need for tombstones?
Date Tue, 16 Dec 2014 15:53:03 GMT
Tombstones have to be created. SSTables are immutable, so the data cannot be deleted in place;
a tombstone is required. The value you deleted is only physically removed later, during compaction.
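The mechanics can be sketched with a toy model (illustrative only — real SSTables are sorted on-disk files, not Python dicts, and tombstone purging also depends on gc_grace_seconds):

```python
# Toy model of immutable SSTables: a delete cannot modify an existing
# SSTable, so it is recorded as a tombstone in a newer one; compaction
# merges SSTables and physically drops data shadowed by a tombstone.

TOMBSTONE = object()  # sentinel marking a deleted key


class ToySSTableStore:
    def __init__(self):
        self.memtable = {}   # mutable in-memory buffer
        self.sstables = []   # immutable dicts, oldest first

    def write(self, key, value):
        self.memtable[key] = value

    def delete(self, key):
        self.memtable[key] = TOMBSTONE  # a marker is written, nothing removed

    def flush(self):
        self.sstables.append(dict(self.memtable))  # frozen from here on
        self.memtable = {}

    def read(self, key):
        # newest value wins: memtable first, then SSTables newest-first
        for table in [self.memtable] + self.sstables[::-1]:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        merged = {}
        for table in self.sstables:  # oldest to newest, so newest wins
            merged.update(table)
        # drop tombstones and the data they shadow
        # (assumes gc_grace_seconds has elapsed)
        self.sstables = [{k: v for k, v in merged.items()
                          if v is not TOMBSTONE}]


store = ToySSTableStore()
store.write("a", 1)
store.flush()
store.delete("a")
store.flush()                   # the tombstone lives in a second SSTable
assert store.read("a") is None  # old value shadowed, but still on "disk"
store.compact()
assert store.sstables == [{}]   # now the value is physically gone
```

Note that between the delete and the compaction, both the old value and the tombstone coexist on disk — which is exactly why deletes alone never reclaim space.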

My workload sounds similar to yours in some respects, and I was able to get C* working for
me. I have large chunks of data which I periodically replace. I write the new data, update
a reference, and then delete the old data. I designed my schema to be tombstone-friendly,
and C* works great. For some of my tables I am able to delete entire partitions. Because of
the reference that I updated, I never try to access the old data, and therefore the tombstones
for these partitions are never read. The old data simply has to wait for compaction. Other
tables require deleting records within partitions. These tombstones do get read, so there
are performance implications. I was able to design my schema so that no partition ever has
more than a few tombstones (one for each generation of deleted data, which is usually no more
than one).
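A minimal sketch of that write-new / flip-reference / delete-old pattern (the names and the in-memory dict are hypothetical stand-ins for real tables):

```python
# Sketch of the generation-swap pattern described above: readers always
# follow a reference to the current generation, so partitions that have
# been deleted (tombstoned) are never read again and simply wait for
# compaction. All names here are hypothetical.
import uuid

partitions = {}     # generation_id -> chunk of data (stand-in for a table)
current_ref = None  # the single reference readers follow


def replace_chunk(new_data):
    """Write a new generation, flip the reference, delete the old one."""
    global current_ref
    new_gen = uuid.uuid4().hex
    partitions[new_gen] = new_data               # 1. write the new data
    old_gen, current_ref = current_ref, new_gen  # 2. update the reference
    if old_gen is not None:
        del partitions[old_gen]                  # 3. delete the old partition


def read_chunk():
    return partitions[current_ref]  # never touches a deleted generation


replace_chunk([1, 2, 3])
assert read_chunk() == [1, 2, 3]
replace_chunk([4, 5])
assert read_chunk() == [4, 5]
assert len(partitions) == 1  # exactly one live generation at a time
```

The key property is in `read_chunk`: because the reference is flipped before the delete, no reader ever scans a tombstoned partition.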

Hope this helps.


On Dec 16, 2014, at 8:22 AM, Ian Rose <> wrote:

Howdy all,

Our use of Cassandra unfortunately involves lots of deletes.  Yes, I know that C* is not
well suited to this kind of workload, but that's where we are, and before I go looking for
an entirely new data layer I would rather explore whether C* could be tuned to work well for us.

However, deletions are never driven by users in our app - deletions always occur by backend
processes to "clean up" data after it has been processed, and thus they do not need to be
100% available.  So this made me think, what if I did the following?

  *   gc_grace_seconds = 0, which ensures that tombstones are never created
  *   replication factor = 3
  *   for writes that are inserts, consistency = QUORUM, which ensures that writes can proceed
even if 1 replica is slow/down
  *   for deletes, consistency = ALL, which ensures that when we delete a record it disappears
entirely (no need for tombstones)
  *   for reads, consistency = QUORUM
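The availability trade-off in that list comes down to replica-overlap arithmetic (a sketch of the reasoning, not driver code):

```python
# Replica-overlap arithmetic for the scheme above, with RF = 3.
# A read at consistency R and a write at consistency W are guaranteed to
# overlap on at least one replica whenever R + W > RF.
RF = 3
QUORUM = RF // 2 + 1  # = 2
ALL = RF              # = 3

# QUORUM writes + QUORUM reads overlap (2 + 2 > 3), and each side
# tolerates one slow/down replica.
assert QUORUM + QUORUM > RF

# ALL deletes + QUORUM reads also overlap (3 + 2 > 3): the delete reached
# every replica, so no QUORUM read can return the stale value. The cost
# is availability: an ALL delete fails if even one replica is down.
assert ALL + QUORUM > RF

print("down replicas tolerated -- delete at ALL:", RF - ALL,
      "| write/read at QUORUM:", RF - QUORUM)
# prints: down replicas tolerated -- delete at ALL: 0 | write/read at QUORUM: 1
```

So the scheme is internally consistent for reads, but every delete becomes a single point of unavailability: one down replica and the cleanup job blocks.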

Also, I should clarify that our data is essentially append-only, so I don't need to worry about
inconsistencies created by partial updates (e.g. value gets changed on one machine but not
another).  Sometimes there will be duplicate writes, but I think that should be fine since
the value is always identical.

Any red flags with this approach?  Has anyone tried it and have experiences to share?  Also,
I *think* that this means that I don't need to run repairs, which from an ops perspective
is great.

Thanks, as always,
- Ian
