incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Time series data and deletion
Date Mon, 11 Mar 2013 13:42:10 GMT
> I'm trying to understand what will happen when we start deleting the old data.
Are you going to delete data or use the TTL?

> With size tiered compaction, suppose we have one 160Gb sstable and some smaller tables
totalling 40Gb.
Not sure about that; it depends on the workload.

>  My understanding is that, even if we start deleting, we will have to wait for 3 more
160Gb tables to appear, in order to have the first sstable compacted and the disk space freed.

v1.2 will run compactions on single SSTables that have a high number of tombstones:
https://issues.apache.org/jira/browse/CASSANDRA-3442
https://issues.apache.org/jira/browse/CASSANDRA-4234
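
To make the idea concrete, here is a minimal sketch of a "compact this SSTable on its own" check based on its share of droppable tombstones. It is not Cassandra's actual code: the function name, its inputs and the 0.2 threshold are assumptions chosen for illustration only.

```python
def worth_compacting_alone(droppable_tombstones, total_columns, threshold=0.2):
    """Return True when the share of droppable tombstones exceeds `threshold`.

    All names and the default threshold are hypothetical, for illustration.
    """
    if total_columns <= 0:
        return False  # an empty SSTable has nothing to reclaim
    return droppable_tombstones / total_columns > threshold


# An SSTable where 30% of the columns are expired tombstones qualifies;
# one with only 5% does not.
print(worth_compacting_alone(300, 1000))
print(worth_compacting_alone(50, 1000))
```

With a check like this, a large SSTable full of deleted data can be rewritten without waiting for similar-sized SSTables to appear.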

>  So although we need to store 200Gb worth of data, we'll need something like 800Gb disk
space in order to be on the safe side, right?

You want to keep the disks below 75% capacity, and you want free space to handle node
moves etc.
I do not think you need 800GB because of tombstone deletions.
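
As a rough illustration of the 75% guideline, the sketch below computes the smallest disk that keeps a given data volume under that fill level. The function name and the 0.75 default are illustrative, and the result does not account for any extra room needed for node moves or large compactions.

```python
def min_disk_capacity_gb(data_gb, max_fill=0.75):
    """Smallest disk size (GB) that keeps `data_gb` at or below `max_fill` of capacity."""
    if not 0 < max_fill <= 1:
        raise ValueError("max_fill must be in (0, 1]")
    return data_gb / max_fill


# 200 GB of live data, disks kept below 75% full
print(round(min_disk_capacity_gb(200), 1))  # 266.7
```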

> What would happen instead with leveled compaction? 
Levelled compaction is more suited to workloads that have a high insert/delete ratio. In your
case, write-once, read-many data is well suited to Size Tiered.

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 8/03/2013, at 9:13 AM, Flavio Baronti <f.baronti@list-group.com> wrote:

> Hello,
> 
> we are using Cassandra for storing time series data. We never update, only append; we
plan to store 1 year worth of data, occupying something around 200Gb. I'm trying to understand
what will happen when we start deleting the old data.
> 
> With size tiered compaction, suppose we have one 160Gb sstable and some smaller tables
totalling 40Gb. My understanding is that, even if we start deleting, we will have to wait
for 3 more 160Gb tables to appear, in order to have the first sstable compacted and the disk
space freed. So although we need to store 200Gb worth of data, we'll need something like 800Gb
disk space in order to be on the safe side, right?
> 
> What would happen instead with leveled compaction? And why is the default sstable size
so small (5Mb)? If we need to store 200Gb, this means we will have 40k sstables; since each
one makes 5 files, we'll have 200k files in a single directory, which I'm afraid will undermine
the stability of the file system.
> 
> Thank you for your suggestions!
> 
> Flavio
> 
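
The file-count estimate in the quoted message can be reproduced with a short calculation. This is only a sketch of that arithmetic: the function name is made up, the defaults (5 MB per sstable, 5 files per sstable) are the figures from the message, and 1 GB is taken as 1000 MB.

```python
def leveled_file_count(data_gb, sstable_mb=5, files_per_sstable=5, mb_per_gb=1000):
    """Return (sstable count, file count) for data split into fixed-size sstables."""
    total_mb = data_gb * mb_per_gb
    sstables = -(-total_mb // sstable_mb)  # ceiling division on integers
    return sstables, sstables * files_per_sstable


print(leveled_file_count(200))  # (40000, 200000)
```

Raising the sstable size shrinks both numbers proportionally; for example, `leveled_file_count(200, sstable_mb=50)` gives a tenth as many files.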

