"However the old rows will not be purged from disk unless all fragments of the row are involved in a compaction process. So it may take some time to purge from disk, depending on the workload. "

http://wiki.apache.org/cassandra/Counters

The doc says: "Counter removal is intrinsically limited. For instance, if you issue very quickly the sequence "increment, remove, increment" it is possible for the removal to be lost (if for some reason the remove happens to be the last received messages). Hence, removal of counters is provided for definitive removal only, that is when the deleted counter is not increment afterwards. This holds for row deletion too: if you delete a row of counters, incrementing any counter in that row (that existed before the deletion) will result in an undetermined behavior. Note that if you need to reset a counter, one option (that is unfortunately not concurrent safe) could be to read its value and add -value."
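To make the "read its value and add -value" caveat concrete, here is a minimal in-memory sketch. It is not Cassandra client code: the `Counter` class, the `reset` helper and the `concurrent_increment` parameter are hypothetical names used only to show how an increment arriving between the read and the write leaves the counter at a non-zero value.

```python
# In-memory stand-in for a counter column; NOT a Cassandra client.
class Counter:
    def __init__(self):
        self.value = 0

    def add(self, delta):
        self.value += delta


def reset(counter, concurrent_increment=0):
    """Reset by reading the value and adding -value.

    concurrent_increment simulates another writer that gets in
    between the read and the compensating write (an assumption
    made purely for illustration).
    """
    observed = counter.value            # step 1: read the current value
    counter.add(concurrent_increment)   # another client increments here
    counter.add(-observed)              # step 2: add -value
    return counter.value


quiet = Counter()
quiet.add(5)
print(reset(quiet))                          # 0: nobody else wrote

busy = Counter()
busy.add(5)
print(reset(busy, concurrent_increment=3))   # 3: the reset did not land on zero
```

With no other writer the counter ends at zero; with one interleaved increment it ends at 3, which is why the doc calls this approach not concurrent safe.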

Just wanted to add that we experienced it. While the data was being purged from disk, we couldn't write anything in that row. I mean, we weren't able to create any new column.

I just wanted to let you know in case it could help.



2013/2/18 aaron morton <aaron@thelastpickle.com>
Sorry, missed the Counters part.

You are probably interested in this one 
https://issues.apache.org/jira/browse/CASSANDRA-5228

Add your need to the ticket to help it along. IMHO if you have write once, read many time series data the SSTables are effectively doing horizontal partitioning for you. So being able to "drop a partition" would make life easier. 

If you can delete entire rows then the deletes have less impact than per-column deletes. However the old rows will not be purged from disk unless all fragments of the row are involved in a compaction process. So it may take some time to purge from disk, depending on the workload. 
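The "all fragments" condition can be pictured with a small sketch. It is a simplified model, not Cassandra's compaction code: the `purgeable_rows` function and the SSTable names are hypothetical, and the model ignores every other condition that affects purging.

```python
def purgeable_rows(sstables, compacting):
    """Return the row keys whose every fragment is part of the compaction.

    sstables:   dict mapping an SSTable name to the set of row keys
                that have a fragment in that file (hypothetical model).
    compacting: set of SSTable names taking part in this compaction.
    """
    inside = set()
    outside = set()
    for name, keys in sstables.items():
        if name in compacting:
            inside.update(keys)
        else:
            outside.update(keys)
    # A row with a fragment in a file left out of the compaction stays on disk.
    return inside - outside


sstables = {
    "a": {"row1", "row2"},
    "b": {"row1"},
    "c": {"row2", "row3"},
}
print(sorted(purgeable_rows(sstables, {"a", "b"})))  # ['row1']
```

Here row2 also has a fragment in SSTable "c", which is not being compacted, so only row1 could be dropped in this round.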

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton

On 18/02/2013, at 10:43 AM, Ilya Grebnov <ilya@metricshub.com> wrote:

According to https://issues.apache.org/jira/browse/CASSANDRA-2103 there is no support for time to live (TTL) on counter columns. Did I miss something?
 
Thanks,
Ilya
From: aaron morton [mailto:aaron@thelastpickle.com] 
Sent: Sunday, February 17, 2013 9:16 AM
To: user@cassandra.apache.org
Subject: Re: Deleting old items during compaction (WAS: Deleting old items)
 
That's what the TTL does. 
 
Manually delete all the older data now, then start using TTL. 
 
Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand
 
@aaronmorton
 
On 13/02/2013, at 11:08 PM, Ilya Grebnov <ilya@metricshub.com> wrote:


Hi,
 
We are looking for a solution to the same problem. We have a wide column family with counters and we want to delete old data, e.g. data more than 1 month old. One potential idea was to implement a hook in the compaction code and drop the columns we don’t need. Is this a viable option?
 
Thanks,
Ilya
From: aaron morton [mailto:aaron@thelastpickle.com] 
Sent: Tuesday, February 12, 2013 9:01 AM
To: user@cassandra.apache.org
Subject: Re: Deleting old items
 
So is it possible to delete all the data inserted in some CF between 2 dates or data older than 1 month ?
No. 
 
You need to issue row level deletes. If you don't know the row key you'll need to do range scans to locate them. 
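As a rough sketch of "scan to find the keys, then delete row by row", the snippet below uses a plain dict as a stand-in for the column family. It is not client code for any Cassandra driver: `delete_rows_with_prefix` and the sample keys are hypothetical, and the "1050#" prefix is taken from the original question.

```python
def delete_rows_with_prefix(store, prefix):
    """Walk every row key, then issue one row-level delete per match.

    store: dict mapping a row key to its columns, standing in for
           the column family (an assumption for illustration).
    """
    # Collect the keys first so the dict is not modified while scanning.
    doomed = [key for key in store if key.startswith(prefix)]
    for key in doomed:
        del store[key]      # one delete per row, not per column
    return sorted(doomed)


cf = {
    "1050#2013-01": {"hits": 12},
    "1050#2013-02": {"hits": 7},
    "2000#2013-01": {"hits": 3},
}
print(delete_rows_with_prefix(cf, "1050#"))  # ['1050#2013-01', '1050#2013-02']
print(sorted(cf))                            # ['2000#2013-01']
```

The scan touches every key because the keys to delete are not known up front; that is the cost of not tracking row keys elsewhere.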
 
If you are deleting parts of wide rows, consider reducing the min_compaction_threshold on the CF to 2.
 
Cheers
 
 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand
 
@aaronmorton
 
On 12/02/2013, at 4:21 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:



Hi,
 
I would like to know if there is a way to delete old/unused data easily ?
 
I know about TTL but there are 2 limitations of TTL:
 
- AFAIK, there is no TTL on counter columns
- TTL needs to be defined at write time, so it's too late for data already inserted.
 
I could also use a standard "delete", but it seems inappropriate for such a massive deletion.
 
In some cases, I don't know the row key and would like to delete all the rows starting with, let's say, "1050#..." 
 
Even better, I understood that columns are always inserted in C* with (name, value, timestamp). So is it possible to delete all the data inserted in some CF between 2 dates or data older than 1 month ?
 
Alain