Hi Jonathan,

Thanks for your response.

We were running a compaction at least once a day over the keyspace.  gc_grace was set to only 1 hour, so from what you said I would expect tombstones to be deleted after at most 3 days.
When I inspected the data in the SSTables after a compaction, some rows contained millions of tombstones, many with timestamps more than 2 weeks old.

We have recently migrated to a new schema design that avoids deleting columns or rows.
I ran another compact once data was not being added to the new keyspace (it only ever added new columns, never modified existing or deleted columns).  That compact deleted all of the existing tombstones, reducing our data from ~250G down to ~30G.
I assume there must have been something strange in our keyspace that prevented tombstones from being deleted just while data was being added.
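
One possible explanation for the behaviour described above is the CASSANDRA-2786 rule quoted further down in this thread: a tombstone is kept if it might still shadow data in an SSTable that is not part of the compaction. The sketch below is a toy model of that rule, not Cassandra's actual code. The function name, its parameters, and the use of plain sets of row keys to stand in for other SSTables are all assumptions made for illustration.

```python
from typing import Iterable, Set


def can_purge_tombstone(
    row_key: str,
    deleted_at: int,
    now: int,
    gc_grace: int,
    other_sstable_keys: Iterable[Set[str]],
) -> bool:
    """Toy model: may a compaction drop this tombstone?

    other_sstable_keys holds, for each SSTable NOT included in the
    compaction, the set of row keys it may contain (hypothetical stand-in
    for whatever membership check the real code uses).
    """
    # Rule 1: the tombstone must be older than gc_grace.
    if now - deleted_at < gc_grace:
        return False
    # Rule 2: keep the tombstone if the row may exist in any SSTable
    # outside this compaction, since it might still shadow data there.
    return not any(row_key in keys for keys in other_sstable_keys)


# A row that keeps receiving writes is likely to appear in a freshly
# flushed SSTable that the running compaction does not include.
flushed_during_compaction = [{"busy_row"}]
busy = can_purge_tombstone("busy_row", 0, 7200, 3600, flushed_during_compaction)
idle = can_purge_tombstone("idle_row", 0, 7200, 3600, flushed_during_compaction)
```

Under this model, `busy` is False and `idle` is True, which would be consistent with tombstones only disappearing once writes to the keyspace had stopped.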

We no longer delete columns, so the issue is no longer critical for us, but I am still curious about what the issue was and why it occurred, just in case we start deleting columns again ;-)

Thanks,
Ross



On 4 April 2012 09:10, Jonathan Ellis <jbellis@gmail.com> wrote:
Removing expired columns actually requires two compaction passes: one
to turn the expired column into a tombstone; one to remove the
tombstone after gc_grace_seconds. (See
https://issues.apache.org/jira/browse/CASSANDRA-1537.)
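
The two passes described above can be illustrated with a toy model of a single row. This is a simplified sketch, not Cassandra's implementation: the `Column` class, the `compact` function, and the choice to use the expiry time as the tombstone's deletion time are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

GC_GRACE_SECONDS = 3600  # the value mentioned in this thread


@dataclass
class Column:
    name: str
    expires_at: Optional[int] = None  # TTL expiry time; None if no TTL
    deleted_at: Optional[int] = None  # set once the column is a tombstone


def compact(columns: List[Column], now: int,
            gc_grace: int = GC_GRACE_SECONDS) -> List[Column]:
    """One simplified compaction pass over the columns of a single row."""
    out = []
    for col in columns:
        if col.deleted_at is not None:
            # Existing tombstone: drop it only once gc_grace has elapsed.
            if now - col.deleted_at >= gc_grace:
                continue
            out.append(col)
        elif col.expires_at is not None and col.expires_at <= now:
            # Expired column: this pass only converts it to a tombstone
            # (assumption: deletion time is taken from the expiry time).
            out.append(Column(col.name, deleted_at=col.expires_at))
        else:
            # Live column: kept unchanged.
            out.append(col)
    return out


row = [Column("a", expires_at=100), Column("b")]
later = 100 + 2 * GC_GRACE_SECONDS
after_first = compact(row, now=later)           # "a" is now a tombstone
after_second = compact(after_first, now=later)  # tombstone "a" is dropped
```

Even though column "a" expired well over gc_grace ago, it survives the first pass as a tombstone and only disappears in the second.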

Perhaps CASSANDRA-2786 was causing things to (erroneously) be cleaned
up early enough that this helped you out in 0.8.2?

On Wed, Mar 21, 2012 at 8:38 PM, Ross Black <ross.w.black@gmail.com> wrote:
> Hi,
>
> We recently moved from 0.8.2 to 1.0.8 and the behaviour seems to have
> changed so that tombstones are now not being deleted.
>
> Our application continually adds and removes columns from Cassandra.  We
> have set a short gc_grace time (3600) since our application would
> automatically delete zombies if they appear.
> Under 0.8.2, the tombstones remained at a relatively constant number.
> Under 1.0.8, the tombstones have been continually increasing so that they
> exceed the size of our real data (at this stage we have over 100G of
> tombstones).
> Even after running a full compact the new compacted SSTable contains a
> massive number of tombstones, many that are several weeks old.
>
> Have I missed some new configuration option to allow deletion of tombstones?
>
> I also noticed that one of the changes between 0.8.2 and 1.0.8 was
> https://issues.apache.org/jira/browse/CASSANDRA-2786 which changed code to
> "avoid dropping tombstones when they might still be needed to shadow data in
> another sstable".
> Could this be having an impact since we continually add and remove columns
> even while a major compact is executing?
>
>
> Thanks,
> Ross
>



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com