incubator-cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Deleting old items
Date Wed, 13 Feb 2013 09:14:59 GMT
Hi Aaron, once again thanks for this answer.

"So is it possible to delete all the data inserted in some CF between 2
dates or data older than 1 month ?"

"No. "

Why is there no way to delete or fetch data using the internal
timestamp stored alongside every inserted column (as described here:
http://www.datastax.com/docs/1.1/ddl/column_family#standard-columns)? Is
that a feature that could be developed one day? It would be useful for
deleting old data or, for example, for copying just the last week of
data to a dev cluster.

With "min_compaction_level_threshold" did you mean "min_compaction_threshold"?
If so, why should I do that? What are the advantages and drawbacks of
reducing this value?

Looking at the doc I saw that: "max_compaction_threshold: Ignored in
Cassandra 1.1 and later.". How can I ensure that I always keep a small
number of SSTables, then? Why is this deprecated?

Alain


2013/2/12 aaron morton <aaron@thelastpickle.com>

> So is it possible to delete all the data inserted in some CF between 2
> dates or data older than 1 month ?
>
> No.
>
> You need to issue row level deletes. If you don't know the row key you'll
> need to do range scans to locate them.
>
> If you are deleting parts of wide rows consider reducing the
> min_compaction_level_threshold on the CF to 2
>
> Cheers
>
>
>    -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 12/02/2013, at 4:21 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>
> Hi,
>
> I would like to know if there is a way to delete old/unused data easily ?
>
> I know about TTL but there are 2 limitations of TTL:
>
> - AFAIK, there is no TTL on counter columns
> - TTL needs to be defined at write time, so it's too late for data already
> inserted.
>
> I could also use a standard "delete", but it seems inappropriate for such a
> massive deletion.
>
> In some cases, I don't know the row key and would like to delete all the
> rows starting with, let's say, "1050#..."
>
> Even better, I understood that columns are always inserted in C* with
> (name, value, timestamp). So is it possible to delete all the data inserted
> in some CF between 2 dates or data older than 1 month ?
>
> Alain
>
>
>
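
The approach suggested in the thread (scan to locate row keys, then issue deletes for the data that is too old) can be sketched roughly as follows. This is only an in-memory model, not Cassandra client code: the `ColumnFamily` dictionary, `scan_keys`, and `delete_old_columns` are hypothetical names invented for illustration, and the scan here simply visits every key and filters by prefix. The model also reads each column's write timestamp directly, which is exactly what the thread says cannot be used as a server-side delete criterion.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical in-memory stand-in for a column family:
# row key -> {column name: (value, write timestamp in microseconds)}
ColumnFamily = Dict[str, Dict[str, Tuple[str, int]]]


def scan_keys(cf: ColumnFamily, prefix: str) -> Iterator[str]:
    """Visit every row key and yield the ones that start with `prefix`."""
    for key in sorted(cf):
        if key.startswith(prefix):
            yield key


def delete_old_columns(cf: ColumnFamily, prefix: str, cutoff_us: int) -> List[Tuple[str, str]]:
    """Delete columns written before `cutoff_us` in rows matching `prefix`.

    Returns the (row key, column name) pairs that were removed.
    """
    deleted: List[Tuple[str, str]] = []
    for key in list(scan_keys(cf, prefix)):
        row = cf[key]
        # Collect the names first so the row is not mutated while iterating.
        old = [name for name, (_, ts) in row.items() if ts < cutoff_us]
        for name in old:
            del row[name]
            deleted.append((key, name))
        if not row:
            # In this model a row with no columns left is dropped entirely.
            del cf[key]
    return deleted
```

A real cluster would do the scan and the deletes through a client library, and the deleted data would only be physically removed later by compaction; neither is modelled here.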
