incubator-cassandra-user mailing list archives

From Redmumba <redmu...@gmail.com>
Subject Re: Customized Compaction Strategy: Dev Questions
Date Wed, 04 Jun 2014 17:10:38 GMT
Thanks, Russell--yes, a similar concept, just applied to sstables.  I'm
assuming this would require changes to both major compactions, and probably
GC (to remove the old tables), but since I'm not super-familiar with the C*
internals, I wanted to make sure it was feasible with the current toolset
before I actually dived in and started tinkering.

Andrew


On Wed, Jun 4, 2014 at 10:04 AM, Russell Bradberry <rbradberry@gmail.com>
wrote:

> hmm, I see. So something similar to Capped Collections in MongoDB.
>
>
>
> On June 4, 2014 at 1:03:46 PM, Redmumba (redmumba@gmail.com) wrote:
>
>  Not quite; if I'm at say 90% disk usage, I'd like to drop the oldest
> sstable rather than simply run out of space.
>
> The problem with using TTLs is that I have to try and guess how much data
> is being put in--since this is auditing data, the usage can vary wildly
> depending on time of year, verbosity of auditing, etc.  I'd like to
> maximize the disk space--not optimize the cleanup process.
>
> Andrew
>
>
> On Wed, Jun 4, 2014 at 9:47 AM, Russell Bradberry <rbradberry@gmail.com>
> wrote:
>
>>  You mean this:
>>
>>  https://issues.apache.org/jira/browse/CASSANDRA-5228
>>
>>  ?
>>
>>
>>
>> On June 4, 2014 at 12:42:33 PM, Redmumba (redmumba@gmail.com) wrote:
>>
>>   Good morning!
>>
>> I've asked (and seen other people ask) about the ability to drop old
>> sstables, basically creating a FIFO-like clean-up process.  Since we're
>> using Cassandra as an auditing system, this is particularly appealing to us
>> because it means we can maximize the amount of auditing data we can keep
>> while still allowing Cassandra to clear old data automatically.
>>
>> My idea is this: perform compaction based on the range of dates available
>> in the sstable (or just metadata about when it was created).  For example,
>> a major compaction could create a combined sstable per day--so that, say,
>> 60 days of data would end up in 60 sstables after a major compaction.
>>
>> My question then is, will this be possible by simply implementing a
>> separate AbstractCompactionStrategy?  Does this sound feasible at all?
>> Based on the implementation of Size and Leveled strategies, it looks like I
>> would have the ability to control what and how things get compacted, but I
>> wanted to verify before putting time into it.
>>
>> Thank you so much for your time!
>>
>> Andrew
>>
>>
>
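
The two ideas in the thread (one combined sstable per day, and dropping the
oldest sstable once the disk reaches roughly 90% usage) can be sketched
independently of Cassandra's internals. The Python below is only an
illustration of the selection logic under stated assumptions: SSTableInfo,
group_by_day and pick_tables_to_drop are hypothetical names, not part of
Cassandra or of AbstractCompactionStrategy, and the sketch assumes each
sstable exposes the timestamp of its newest write and its size on disk.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SSTableInfo:
    """Hypothetical stand-in for per-sstable metadata; not a Cassandra class."""
    name: str
    max_timestamp: datetime  # newest write contained in the sstable (assumed available)
    size_bytes: int


def group_by_day(sstables):
    """Bucket sstables by the UTC day of their newest write.

    Each bucket is a candidate set for one per-day compaction.
    """
    buckets = defaultdict(list)
    for table in sstables:
        day = table.max_timestamp.astimezone(timezone.utc).date()
        buckets[day].append(table)
    return dict(buckets)


def pick_tables_to_drop(sstables, used_bytes, capacity_bytes, threshold=0.90):
    """Return the oldest sstables to drop so usage falls to the threshold or below.

    FIFO order: the sstable with the oldest newest-write goes first.
    """
    limit = threshold * capacity_bytes
    to_drop = []
    for table in sorted(sstables, key=lambda t: t.max_timestamp):
        if used_bytes <= limit:
            break
        to_drop.append(table)
        used_bytes -= table.size_bytes
    return to_drop


if __name__ == "__main__":
    utc = timezone.utc
    tables = [
        SSTableInfo("a", datetime(2014, 6, 1, 3, tzinfo=utc), 40),
        SSTableInfo("b", datetime(2014, 6, 1, 20, tzinfo=utc), 30),
        SSTableInfo("c", datetime(2014, 6, 2, 1, tzinfo=utc), 30),
    ]
    for day, group in sorted(group_by_day(tables).items()):
        print(day, [t.name for t in group])
    dropped = pick_tables_to_drop(tables, used_bytes=95, capacity_bytes=100)
    print("drop:", [t.name for t in dropped])
```

Dropping a whole sstable this way is only reasonable for append-only data
such as the audit records described above, where an older sstable is not
needed to interpret newer ones; that is an assumption of the sketch, not
something the thread confirms.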
