incubator-cassandra-user mailing list archives

From Віталій Тимчишин <tiv...@gmail.com>
Subject Re: any ways to have compaction use less disk space?
Date Mon, 24 Sep 2012 09:02:46 GMT
Why so?
What are the pros and cons?
For me, the concern is the number of files in the directory.
700GB / 512MB * 5 (files per SSTable) = 7000 files, which is OK in my view.
700GB / 5MB * 5 = 700000 files, which is too many for a single directory,
uses too much memory for SSTable data, and makes the compaction queue too
large (that leads to strange pauses, I suppose because the compactor is
working out what to compact next), ...
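
The file-count estimate above can be reproduced with a short Python sketch.
This is only back-of-the-envelope arithmetic, not anything Cassandra provides:
the function name is hypothetical, and it assumes the data splits evenly into
SSTables of the configured size, with 5 files per SSTable as in the figures
above. It uses 1024 MB per GB, so the second result comes out slightly above
the rounded 700000.

```python
# Rough estimate of the number of files in a data directory.
# Assumptions (illustrative only): data divides evenly into SSTables of the
# configured size, and every SSTable consists of 5 files on disk.
FILES_PER_SSTABLE = 5


def estimate_file_count(data_gb, sstable_size_mb, files_per_sstable=FILES_PER_SSTABLE):
    """Return the approximate number of files for a given SSTable size."""
    data_mb = data_gb * 1024
    sstables = -(-data_mb // sstable_size_mb)  # ceiling division
    return sstables * files_per_sstable


print(estimate_file_count(700, 512))  # 7000
print(estimate_file_count(700, 5))    # 716800
```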

2012/9/23 Aaron Turner <synfinatic@gmail.com>

> On Sun, Sep 23, 2012 at 8:18 PM, Віталій Тимчишин <tivv00@gmail.com> wrote:
> > If you think about space, use Leveled compaction! This will not only allow
> > you to fill more space, but will also shrink your data much faster in case
> > of updates. Size-tiered compaction can use 3x-4x more space than there is
> > live data. Consider the following (our simplified) scenario:
> > 1) The data is updated weekly
> > 2) Each week a large SSTable is written (say, 300GB) after a full update
> > is processed.
> > 3) In 3 weeks you will have 1.2TB of data in 3 large SSTables.
> > 4) Only after the 4th week will they all be compacted into one 300GB SSTable.
> >
> > Leveled compaction has tamed space usage for us. Note that you should set
> > sstable_size_in_mb to a reasonably high value (it is 512 for us, with
> > ~700GB per node) to prevent creating a lot of small files.
>
> 512MB per sstable? Wow, that's freaking huge. From my conversations
> with various developers, 5-10MB seems far more reasonable. I guess it
> really depends on your usage patterns, but that seems excessive to me,
> especially as sstables are promoted.
>
>
-- 
Best regards,
 Vitalii Tymchyshyn
