incubator-cassandra-user mailing list archives

From Stephen Mullins <smull...@thebrighttag.com>
Subject Re: Strange row expiration behavior
Date Wed, 24 Oct 2012 14:02:23 GMT
That worked perfectly, inserting another row after the first compaction,
then flushing and compacting again triggered the empty rows to be removed.
Thanks for your help and for clarifying the "gcBefore" point Aaron.

Stephen
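
For anyone following along, here is a minimal Python sketch of the "gcBefore" rule as Aaron describes it in the quoted message. It is only a model, not Cassandra code: the names ExpiringColumn, gc_before, is_purgeable and compact_row are made up for illustration, and it assumes gcBefore is simply the compaction time minus gc_grace_seconds, with all times in whole seconds on the node's local clock.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpiringColumn:
    """Hypothetical stand-in for a column written with a TTL."""
    name: str
    written_at: int  # node-local write time, in seconds
    ttl: int         # time to live, in seconds

    @property
    def local_expiry(self) -> int:
        # Absolute (node-local) time at which the column expires.
        return self.written_at + self.ttl


def gc_before(now: int, gc_grace_seconds: int) -> int:
    # Assumption: gcBefore is the compaction time minus gc_grace_seconds.
    return now - gc_grace_seconds


def is_purgeable(col: ExpiringColumn, now: int, gc_grace_seconds: int) -> bool:
    # Both conditions from the thread: the TTL has expired AND the
    # expiry time falls before the current gcBefore.
    ttl_expired = now >= col.local_expiry
    return ttl_expired and col.local_expiry < gc_before(now, gc_grace_seconds)


def compact_row(columns: List[ExpiringColumn], now: int,
                gc_grace_seconds: int) -> Optional[List[ExpiringColumn]]:
    # Keep the columns that cannot be purged yet; a row with no
    # surviving columns is dropped entirely (returned as None).
    survivors = [c for c in columns
                 if not is_purgeable(c, now, gc_grace_seconds)]
    return survivors or None
```

With the numbers from the test (ttl=5, gc_grace 10), a column written at t=0 expires at t=5 but is not purgeable by a compaction at t=12, since gcBefore is then only 2. A compaction at t=20 can drop it, and in this model the row goes with it.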

On Tue, Oct 23, 2012 at 4:47 PM, aaron morton <aaron@thelastpickle.com> wrote:

> In the first example, I am running compaction at step 7 through nodetool,
>
> Sorry missed that.
>
>
>>    1. insert a couple rows with ttl=5 (again, just a small number)
>>
> ExpiringColumns are only purged if their TTL has expired AND their
> absolute (node local) expiry time occurred before the current "gcBefore"
> time.
> This may have explained why the columns were not purged in the first
> compaction.
>
> Can you try your first steps again? Then, for the second set of steps,
> add a new row, flush, and compact. The expired rows should be removed.
>
> I don't have to manually delete empty rows after the columns expire.
>
> Rows are automatically purged when all columns are purged.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 24/10/2012, at 3:05 AM, Stephen Mullins <smullins@thebrighttag.com>
> wrote:
>
> Thanks Aaron, my reply is inline below:
>
> On Tue, Oct 23, 2012 at 2:38 AM, aaron morton <aaron@thelastpickle.com> wrote:
>
>> Performing these steps results in the rows still being present using *cassandra-cli
>> list*.
>>
>> I assume you are saying the row key is listed without any columns. aka a
>> ghost row.
>>
> Correct.
>
>>
>>  What gets really odd is if I add these steps it works
>>
>> That's working as designed.
>>
>> gc_grace_seconds does not specify when tombstones must be purged, rather
>> it specifies the minimum duration the tombstone must be stored. It's really
>> saying "if you compact this column X seconds after the delete you can purge
>> the tombstone".
>>
>> Minor / automatic compaction will kick in if there are (by default) 4
>> SSTables of the same size, and it will only purge tombstones if all fragments
>> of the row exist in the SSTables being compacted.
>>
>> Major / manual compaction compacts all the sstables, and so purges the
>> tombstones IF gc_grace_seconds has expired.
>>
>> In your first example compaction had not run so the tombstones stayed on
>> disk. In the second the major compaction purged expired tombstones.
>>
> In the first example, I am running compaction at step 7 through nodetool,
> after gc_grace_seconds has expired. Additionally, if I do not perform the
> manual delete of the row in the second example, the ghost rows are not
> cleaned up. I want to know that in our production environment, I don't have
> to manually delete empty rows after the columns expire. But I can't get an
> example working to that effect.
>
>>
>> Hope that helps.
>>
>>   -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 23/10/2012, at 2:49 PM, Stephen Mullins <smullins@thebrighttag.com>
>> wrote:
>>
>> Hello, I'm seeing Cassandra behavior that I can't explain, on v1.0.12.
>> I'm trying to test removing rows after all columns have expired. I've read
>> the following:
>> http://wiki.apache.org/cassandra/DistributedDeletes
>> http://wiki.apache.org/cassandra/MemtableSSTable
>> https://issues.apache.org/jira/browse/CASSANDRA-2795
>>
>> And came up with a test to demonstrate the empty row removal that does
>> the following:
>>
>>    1. create a keyspace
>>    2. create a column family with gc_grace_seconds=10 (arbitrary small number)
>>    3. insert a couple rows with ttl=5 (again, just a small number)
>>    4. use nodetool to flush the column family
>>    5. sleep >10 seconds
>>    6. ensure the columns are removed with *cassandra-cli list *
>>    7. use nodetool to compact the keyspace
>>
>> Performing these steps results in the rows still being present using *cassandra-cli
>> list*. What gets really odd is if I add these steps it works:
>>
>>    1. sleep 5 seconds
>>    2. use cassandra-cli to *del mycf[arow]*
>>    3. use nodetool to flush the column family
>>    4. use nodetool to compact the keyspace
>>
>> I don't understand why the first set of steps (1-7) doesn't work to remove
>> the empty row, nor do I understand why the explicit row delete somehow
>> makes this work. I have all this in a script that I could attach if that's
>> appropriate. Is there something wrong with the steps that I have?
>>
>> Thanks,
>> Stephen
>>
>>
>>
>
>
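
A rough Python model of the minor versus major compaction point from Aaron's first reply. This is a sketch under stated assumptions, not how Cassandra implements it: can_purge_tombstone, the SSTable names and the dictionary layout are all hypothetical, and times are plain seconds.

```python
from typing import Dict, Set


def can_purge_tombstone(row_key: str,
                        compacting: Set[str],
                        sstables: Dict[str, Set[str]],
                        deleted_at: int,
                        now: int,
                        gc_grace_seconds: int) -> bool:
    """sstables maps a (hypothetical) SSTable name to the row keys it holds;
    compacting is the set of SSTable names taking part in this compaction."""
    # gc_grace_seconds is a minimum retention period: the tombstone may be
    # purged only once that much time has passed since the delete.
    grace_elapsed = now - deleted_at > gc_grace_seconds
    # If a fragment of the row lives in an SSTable outside this compaction,
    # the tombstone must be kept.
    row_elsewhere = any(row_key in keys
                        for name, keys in sstables.items()
                        if name not in compacting)
    return grace_elapsed and not row_elsewhere
```

In this model a major compaction always passes the second check, because every SSTable takes part, so only gc_grace_seconds decides. A minor compaction over a subset of SSTables can keep a tombstone long after gc_grace_seconds has passed.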
