incubator-cassandra-user mailing list archives

From Flavio Baronti <f.baro...@list-group.com>
Subject Time series data and deletion
Date Fri, 08 Mar 2013 17:13:06 GMT
Hello,

We are using Cassandra to store time series data. We never update, only append; we plan
to store one year's worth of data, occupying around 200 GB. I'm trying to understand what
will happen when we start deleting the old data.

With size-tiered compaction, suppose we have one 160 GB sstable and some smaller sstables
totalling 40 GB. My understanding is that, even if we start deleting, we will have to wait
for three more 160 GB sstables to appear before the first sstable is compacted and the disk
space is freed. So although we only need to store 200 GB of data, we'll need something like
800 GB of disk space to be on the safe side, right?
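
To make the arithmetic behind this estimate explicit, here is a rough sketch. It assumes
that compaction of the large sstables only starts once four of similar size exist (which is
what "3 more" above implies) and that the smaller sstables stay at 40 GB in the meantime.
The function name and parameters are made up for illustration.

```python
# Back-of-the-envelope estimate only. The threshold of 4 similarly sized
# sstables is an assumption taken from the "3 more" reasoning in the message.
def disk_before_compaction_gb(big_gb, small_total_gb, tables_needed=4):
    """Space occupied just before the large sstables become eligible for compaction."""
    return big_gb * tables_needed + small_total_gb

occupied = disk_before_compaction_gb(160, 40)  # 4 * 160 + 40
margin = 800 - occupied                        # headroom left by the 800 GB guess
print(occupied, margin)
```

Under these assumptions about 680 GB are occupied before anything is freed, so the 800 GB
figure leaves roughly 120 GB of margin. Whether that margin is enough is exactly the open
question, since the compaction output also has to be written somewhere.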

What would happen with leveled compaction instead? And why is the default sstable size so
small (5 MB)? If we need to store 200 GB, we will have 40k sstables; since each one is made
up of 5 files, we'll have 200k files in a single directory, which we're afraid will
undermine the stability of the file system.
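
The file count can be checked with a few lines. This is only a sketch of the estimate above:
it uses decimal units (1 GB = 1000 MB) to match the round "40k" figure, takes the 5 files
per sstable from the message, and the helper name is hypothetical.

```python
MB_PER_GB = 1000  # decimal units, chosen to match the round figures in the message

def sstable_file_count(data_gb, sstable_mb=5, files_per_sstable=5):
    """Return (number of sstables, number of files) for a given data volume."""
    sstables = data_gb * MB_PER_GB // sstable_mb
    return sstables, sstables * files_per_sstable

sstables, files = sstable_file_count(200)
print(sstables, files)
```

With binary units (1 GB = 1024 MB) the counts come out slightly higher, but the order of
magnitude is the same.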

Thank you for your suggestions!

Flavio

