incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Deleting old items
Date Sat, 16 Feb 2013 17:39:59 GMT
>  Is that a feature that could possibly be developed one day ?
No. 
Timestamps are essentially an internal implementation detail, used to resolve conflicting
values for the same column.
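
To illustrate what "resolve" means here, a minimal Python sketch of last-write-wins
reconciliation follows. The `Cell` type and `reconcile` function are hypothetical names
made up for this example, not Cassandra's API, and the tie-break rule is an assumption of
the sketch. The point is that the timestamp is only compared between versions of one
column when they are merged; nothing in this model lets you look columns up by timestamp.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """Toy stand-in for a column: (name, value, timestamp). Hypothetical type."""
    name: str
    value: Optional[bytes]  # None models a tombstone (a deleted column)
    timestamp: int          # write time supplied with the write, e.g. microseconds


def reconcile(a: Cell, b: Cell) -> Cell:
    """Return the winning version of one column: the highest timestamp wins."""
    if a.name != b.name:
        raise ValueError("can only reconcile two versions of the same column")
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Equal timestamps. Assumption for this sketch: a tombstone beats a live
    # value, otherwise the larger value wins so the result is deterministic.
    if a.value is None:
        return a
    if b.value is None:
        return b
    return a if a.value >= b.value else b
```

Whichever order the two versions arrive in, `reconcile` returns the same winner, which is
what lets replicas converge without coordinating.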

> With "min_compaction_level_threshold" did you mean "min_compaction_threshold"? If so,
> why should I do that? What are the advantages and drawbacks of reducing this value?
Yes, min_compaction_threshold, my bad. 
If you have a wide row and delete a lot of values, you will end up with a lot of tombstones.
These may dramatically reduce read performance until they are purged. Reducing
min_compaction_threshold makes compaction run more frequently, which gives tombstones a
chance to be purged sooner.
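
A toy Python model can show why a lower threshold means more frequent compaction. This is
only a sketch under simplifying assumptions: every flush writes one SSTable of size 1, and
SSTables are merged as soon as `min_threshold` of them have exactly the same size. Real
size-tiered compaction groups SSTables of similar size and is considerably more involved;
`simulate` is a hypothetical helper, not part of Cassandra.

```python
from typing import List, Tuple


def simulate(flushes: int, min_threshold: int) -> Tuple[int, int]:
    """Return (compactions_run, sstables_left) after `flushes` memtable flushes.

    Toy model: each flush adds one SSTable of size 1. Whenever `min_threshold`
    SSTables of the same size exist, they are merged into a single SSTable of
    their combined size.
    """
    sizes: List[int] = []
    compactions = 0
    for _ in range(flushes):
        sizes.append(1)
        merged = True
        while merged:  # one merge can make the next size tier eligible too
            merged = False
            for size in set(sizes):
                if sizes.count(size) >= min_threshold:
                    for _ in range(min_threshold):
                        sizes.remove(size)
                    sizes.append(size * min_threshold)
                    compactions += 1
                    merged = True
                    break
    return compactions, len(sizes)
```

In this model, 16 flushes trigger 5 compactions with a threshold of 4 and 15 compactions
with a threshold of 2. The trade-off is more compaction I/O in exchange for data (and
tombstones) being rewritten more often.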

> Looking at the doc I saw that: "max_compaction_threshold: Ignored in Cassandra 1.1 and
> later.". How can I ensure that I always keep a small number of SSTables, then?
AFAIK it's not ignored.
There may be some confusion about where the settings live in the CLI vs CQL.
Can you point me to the docs?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/02/2013, at 10:14 PM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:

> Hi Aaron, once again thanks for this answer.
>> "So is it possible to delete all the data inserted in some CF between 2 dates or
>> data older than 1 month?"
> "No. "
> 
> Why is there no way of deleting or getting data using the internal timestamp stored alongside
> every inserted column (as described here: http://www.datastax.com/docs/1.1/ddl/column_family#standard-columns)?
> Is that a feature that could possibly be developed one day? It could be useful for
> deleting old data, or for bringing just the last week of data to a dev cluster, for example.
> 
> With "min_compaction_level_threshold" did you mean "min_compaction_threshold"? If so,
> why should I do that? What are the advantages and drawbacks of reducing this value?
> 
> Looking at the doc I saw that: "max_compaction_threshold: Ignored in Cassandra 1.1 and
> later.". How can I ensure that I always keep a small number of SSTables, then? Why is this
> deprecated?
> 
> Alain
> 
> 
> 2013/2/12 aaron morton <aaron@thelastpickle.com>
>> So is it possible to delete all the data inserted in some CF between 2 dates or data
>> older than 1 month?
> No. 
> 
> You need to issue row level deletes. If you don't know the row key you'll need to do
> range scans to locate them.
> 
> If you are deleting parts of wide rows consider reducing the min_compaction_level_threshold
> on the CF to 2.
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 12/02/2013, at 4:21 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
> 
>> Hi,
>> 
>> I would like to know if there is a way to delete old/unused data easily ?
>> 
>> I know about TTL but there are 2 limitations of TTL:
>> 
>> - AFAIK, there is no TTL on counter columns
>> - TTL needs to be defined at write time, so it's too late for data already inserted.
>> 
>> I could also use a standard "delete", but it seems inappropriate for such a massive deletion.
>> 
>> In some cases, I don't know the row key and would like to delete all the rows starting
>> with, let's say, "1050#..."
>> 
>> Even better, I understood that columns are always inserted in C* with (name, value,
>> timestamp). So is it possible to delete all the data inserted in some CF between 2 dates or
>> data older than 1 month?
>> 
>> Alain
> 
> 

