cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Strange row expiration behavior
Date Tue, 23 Oct 2012 07:38:19 GMT
> Performing these steps results in the rows still being present using cassandra-cli list.

I assume you mean the row key is listed without any columns, a.k.a. a ghost row.

>  What gets really odd is if I add these steps it works
That's working as designed. 

gc_grace_seconds does not specify when tombstones must be purged; rather, it specifies the
minimum duration the tombstone must be stored. It's really saying "if you compact this column
more than X seconds after the delete, you can purge the tombstone".
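
To make the timing rule concrete, here is a minimal sketch in Python. It is not Cassandra
code: the function name and parameters are hypothetical, times are plain seconds, and the
exact boundary behaviour (strictly greater than) is an assumption for illustration.

```python
def tombstone_is_purgeable(deleted_at: int, gc_grace_seconds: int, compaction_time: int) -> bool:
    """Return True if a compaction running at compaction_time MAY drop the tombstone.

    gc_grace_seconds is only a minimum retention period: nothing is purged
    until a compaction actually processes the tombstone after that period.
    """
    return compaction_time - deleted_at > gc_grace_seconds


# A compaction 5 seconds after the delete must keep the tombstone ...
print(tombstone_is_purgeable(deleted_at=0, gc_grace_seconds=10, compaction_time=5))
# ... while one 15 seconds after the delete is allowed to drop it.
print(tombstone_is_purgeable(deleted_at=0, gc_grace_seconds=10, compaction_time=15))
```

If no compaction ever touches the tombstone, it stays on disk no matter how long ago
gc_grace_seconds expired.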

Minor / automatic compaction will kick in when there are (by default) 4 SSTables of similar
size, and it will only purge tombstones if all fragments of the row exist in the SSTables being
compacted.

Major / manual compaction compacts all the SSTables, and so purges the tombstones IF gc_grace_seconds
has expired.
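
The difference between the two kinds of compaction can be sketched as follows. This is a toy
model in Python, not Cassandra's implementation: SSTables are represented as a hypothetical
dict mapping an SSTable name to the set of row keys it holds fragments of, and
compaction_purges_row is an illustrative name.

```python
def compaction_purges_row(row_key, all_sstables, compacted, gc_grace_elapsed):
    """Return True if compacting the SSTables named in `compacted` can purge row_key's tombstones.

    all_sstables: dict of SSTable name -> set of row keys with fragments in that SSTable.
    compacted: names of the SSTables taking part in this compaction.
    gc_grace_elapsed: whether gc_grace_seconds has passed since the delete.
    """
    # Every SSTable that holds a fragment of the row.
    holders = {name for name, keys in all_sstables.items() if row_key in keys}
    # Purge only if gc_grace has expired AND no fragment is left outside the compaction.
    return gc_grace_elapsed and holders <= set(compacted)


sstables = {"a": {"r1"}, "b": {"r1", "r2"}, "c": {"r2"}}

# Minor compaction of only "a": a fragment of r1 still lives in "b", so nothing is purged.
print(compaction_purges_row("r1", sstables, ["a"], gc_grace_elapsed=True))
# Major compaction includes every SSTable, so all fragments are covered.
print(compaction_purges_row("r1", sstables, list(sstables), gc_grace_elapsed=True))
```

A major compaction always satisfies the "all fragments" condition, which leaves
gc_grace_seconds as the only remaining requirement.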

In your first example compaction had not run so the tombstones stayed on disk. In the second
the major compaction purged expired tombstones. 

Hope that helps. 
  
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/10/2012, at 2:49 PM, Stephen Mullins <smullins@thebrighttag.com> wrote:

> Hello, I'm seeing Cassandra behavior that I can't explain, on v1.0.12. I'm trying to
> test removing rows after all columns have expired. I've read the following:
> http://wiki.apache.org/cassandra/DistributedDeletes
> http://wiki.apache.org/cassandra/MemtableSSTable
> https://issues.apache.org/jira/browse/CASSANDRA-2795
> 
> And came up with a test to demonstrate the empty row removal that does the following:
> 1. create a keyspace
> 2. create a column family with gc_seconds=10 (arbitrary small number)
> 3. insert a couple rows with ttl=5 (again, just a small number)
> 4. use nodetool to flush the column family
> 5. sleep >10 seconds
> 6. ensure the columns are removed with cassandra-cli list
> 7. use nodetool to compact the keyspace
> Performing these steps results in the rows still being present using cassandra-cli list.
> What gets really odd is if I add these steps it works:
> 1. sleep 5 seconds
> 2. use cassandra-cli to del mycf[arow]
> 3. use nodetool to flush the column family
> 4. use nodetool to compact the keyspace
> I don't understand why the first set of steps (1-7) don't work to remove the empty row,
> nor do I understand why the explicit row delete somehow makes this work. I have all this in
> a script that I could attach if that's appropriate. Is there something wrong with the steps
> that I have?
> 
> Thanks,
> Stephen

