incubator-cassandra-user mailing list archives

From Brian Tarbox <tar...@cabotresearch.com>
Subject Re: can I kill very old data files in my data folder (I know that sounds crazy but....)
Date Wed, 18 Jun 2014 19:05:42 GMT
Rob,
Thank you! We are not using TTL; we're manually deleting data more than 5
days old for this CF. We're running 1.2.13 and are using size-tiered
compaction (this CF is append-only, i.e. zero updates).

If I understand you, it sounds like we can get away with a (stop node,
delete old data files, restart) process on a rolling basis.
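
Before deleting anything, a dry-run listing of the candidate files seems
prudent. Below is a minimal sketch, not a supported tool: it only reports
file groups whose every file is older than a cutoff, and deletes nothing.
The helper name stale_sstable_groups, the data_dir argument, and the
filename pattern (a numeric generation followed by a component name, e.g.
something like ks-cf-ic-12-Data.db) are assumptions for illustration; check
them against what is actually in the data directory. Stopping the node,
removing the files, and restarting are still manual steps.

```python
import re
import time
from collections import defaultdict
from pathlib import Path

# Assumed naming: <prefix>-<generation>-<Component>.<ext>
# (hypothetical pattern; verify against the real files before relying on it)
SSTABLE_RE = re.compile(
    r"^(?P<prefix>.+)-(?P<gen>\d+)-(?P<component>[A-Za-z]+)\.\w+$"
)


def stale_sstable_groups(data_dir, max_age_days, now=None):
    """Return {(prefix, generation): [paths]} for file groups in which
    every file was last modified more than max_age_days ago.

    This is a dry run: nothing is deleted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400

    # Group component files (Data, Index, Filter, ...) by generation.
    groups = defaultdict(list)
    for path in Path(data_dir).iterdir():
        match = SSTABLE_RE.match(path.name)
        if path.is_file() and match:
            key = (match.group("prefix"), int(match.group("gen")))
            groups[key].append(path)

    # Keep a group only if all of its files are older than the cutoff.
    return {
        key: sorted(paths)
        for key, paths in groups.items()
        if all(p.stat().st_mtime < cutoff for p in paths)
    }
```

Calling stale_sstable_groups(data_dir, 5) on each node would list the groups
older than the 5-day retention window; a group is reported only when all of
its component files are old, so a partially rewritten group is left alone.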

Thanks,

Brian


On Wed, Jun 18, 2014 at 2:37 PM, Robert Coli <rcoli@eventbrite.com> wrote:

> On Wed, Jun 18, 2014 at 10:56 AM, Brian Tarbox <tarbox@cabotresearch.com>
> wrote:
>
>> I have a column family that only stores the last 5 days' worth of some
>> data...and yet I have files in the data directory for this CF that are 3
>> weeks old.
>>
>
> Are you using TTL? If so :
>
> https://issues.apache.org/jira/browse/CASSANDRA-6654
>
> Are you using size tiered or level compaction?
>
>> I have six bunches of these file groups, each with a different nnnn
>> value...and with timestamps of each of the last five days...plus one group
>> from 3 weeks ago...which makes me wonder if that group somehow should have
>> been deleted but was not.
>>
>> The files are tens or hundreds of gigs so deleting would be good, unless
>> it's really bad!
>>
>
> Data files can't be deleted from the data dir with Cassandra running, but
> it should be fine (if probably technically unsupported) to delete them with
> Cassandra stopped. In most cases you don't want to do so, because you might
> un-mask deleted rows or cause unexpected consistency characteristics.
>
> In your case, you know that no data in files created 3 weeks ago can
> possibly have any value, so it is safe to delete them.
>
> =Rob
>
>
