incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: Strategy to delete/expire keys in cassandra
Date Tue, 23 Feb 2010 10:26:12 GMT
Hi,

The following ticket/patch may be what you are looking for:
https://issues.apache.org/jira/browse/CASSANDRA-699

It's flagged for 0.7, but because it breaks the API (and if I understand the
release plan correctly) it may not make it into Cassandra before 0.8. The
patch will also have to change to accommodate the changes that will be
made to the internals in 0.7.

Anyway, what I can at least tell you is that I'm using the patch against
0.5 in a test cluster without problems so far.

> 3)      Once keys are deleted, do you have to wait till next GC to clean
> them from disk or memory (suppose you don’t run cleanup manually)? What’s
> the strategy for Cassandra to handle deleted items (notify other replica
> nodes, cleanup memory/disk, defrag/rebuild disk files, rebuild bloom filter
> etc). I’m asking this because if the keys refresh very fast (i.e., high
> volume write/read and expiration is kind of short) how will the data file
> grow and how does this impact the system performance.

Items are deleted only during compaction, and you may actually have to
wait for GCGraceSeconds to elapse before deletion. This value is configurable
in storage-conf.xml and is 10 days by default. You can decrease it, but
for consistency reasons (and because you have to wait at least for a
compaction to occur) there will always be a delay before the actual delete.
All of this is also true for the patch I mention above, by the way. When an
item is finally deleted, compaction simply skips it, so it's really cheap.
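
As a rough illustration of the behavior described above, here is a toy model
in Python. It is only a sketch and not Cassandra's actual code: the names
Cell, deleted_at and compact are hypothetical, and the exact comparison
Cassandra uses at the grace boundary is an assumption. It shows a deleted
item staying on disk as a marker until a compaction runs after the grace
period, at which point it is simply not copied to the new file.

```python
from dataclasses import dataclass
from typing import List, Optional

# 10 days in seconds, matching the default mentioned above.
GC_GRACE_SECONDS = 10 * 24 * 60 * 60


@dataclass
class Cell:
    """Hypothetical stand-in for one stored item."""
    key: str
    value: Optional[str]              # None marks a deleted item (tombstone)
    deleted_at: Optional[int] = None  # deletion time in seconds, tombstones only


def compact(cells: List[Cell], now: int,
            gc_grace: int = GC_GRACE_SECONDS) -> List[Cell]:
    """Return the cells that survive a compaction pass run at time `now`.

    Live cells are always kept. A tombstone is kept until it is older than
    `gc_grace`; after that it is skipped, i.e. not written to the output.
    """
    survivors = []
    for cell in cells:
        is_tombstone = cell.value is None
        if is_tombstone and now - cell.deleted_at > gc_grace:
            continue  # old enough: drop it by not copying it
        survivors.append(cell)
    return survivors


cells = [
    Cell("a", "live value"),
    Cell("b", None, deleted_at=0),        # deleted long ago
    Cell("c", None, deleted_at=900_000),  # deleted recently
]
remaining = compact(cells, now=1_000_000)
print([c.key for c in remaining])
```

With now=1_000_000, the tombstone for "b" is older than the grace period and
is dropped, while the one for "c" is kept until a later compaction. Lowering
gc_grace in this model shortens the wait but never removes it, since nothing
is dropped until compact() actually runs.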

--
Sylvain
