incubator-cassandra-user mailing list archives

From Robert Coli <rc...@eventbrite.com>
Subject Re: Is it possible to control the sstable file size in incremental backup or snapshot
Date Mon, 23 Sep 2013 18:33:16 GMT
On Fri, Sep 20, 2013 at 6:56 PM, java8964 java8964 <java8964@hotmail.com> wrote:

> I noticed the snapshot and incremental backup sstable files size generated
> from our production environment vary dramatically. Some files can be
> hundreds of M, or even close to G, but a lot of files are even less than 1k
> bytes, especially in the incremental backup.
>
> Is there a way to control the size of SSTable files generated in the
> snapshot or incremental backup?  I am talking about one column family.
> Ideally for this column family, I would like to make every SSTable files
> around 500M to 1G.
>

There is no way to control SSTable file size at flush, as the size is
directly related to Memtable size.

One counter-intuitive aspect of snapshots is that, because snapshot files
are hard links, a single "du" run over multiple snapshot directories counts
each referred-to file only once.

So :

du -sk list/ of/ snapshot/ dirs/

Will give different results than :

for name in list/ of/ snapshot/ dirs/
do
   du -sk "$name"
done

Hard linking also has the side effect of making snapshots effectively *grow*
in size as time passes: as the live SSTables they link to are removed, the
snapshots progressively contain the only copy of obsolete data.
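
The hard-link accounting described above can be reproduced in a scratch
directory. This is only a sketch: the "live" and "snap" directory names and
the data.db file are made up for illustration and are not Cassandra's real
layout, and the rm step merely stands in for an obsolete SSTable being
deleted from the live data directory.

```shell
#!/bin/sh
# Scratch layout only; "live" and "snap" are hypothetical names, not Cassandra paths.
tmp=$(mktemp -d)
mkdir "$tmp/live" "$tmp/snap"

# A 1 MiB stand-in for an SSTable, hard-linked into the "snapshot".
dd if=/dev/urandom of="$tmp/live/data.db" bs=1024 count=1024 2>/dev/null
ln "$tmp/live/data.db" "$tmp/snap/data.db"
sync   # flush so du sees allocated blocks on filesystems that allocate lazily

# One du invocation over both directories counts the shared file once.
combined=$(du -sk "$tmp/live" "$tmp/snap" | awk '{ s += $1 } END { print s }')

# Separate invocations count it once per directory.
separate=0
for name in "$tmp/live" "$tmp/snap"
do
    kb=$(du -sk "$name" | awk '{ print $1 }')
    separate=$((separate + kb))
done

# Stand-in for the obsolete file being removed from the live directory:
# the snapshot link is now the only reference, so the space is not freed.
rm "$tmp/live/data.db"
snap_only=$(du -sk "$tmp/snap" | awk '{ print $1 }')

echo "combined=${combined}K separate=${separate}K snapshot_after_delete=${snap_only}K"
rm -rf "$tmp"
```

The combined figure should come out smaller than the sum of the separate
runs, and the snapshot directory should still report the full file size
after the live copy is gone.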

=Rob
