incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: Compaction doubles disk space
Date Tue, 29 Mar 2011 10:29:18 GMT
> BTW, given that compaction requires double disk spaces, does it mean that I
> should never reach half of my total disk space?
> e.g. if I have 505GB data on 1TB disk, I cannot even delete any data at all.

It is not so black and white. What is true is that, in practice,
reaching half the disk should be a first alert, after which you should
start monitoring things more carefully to avoid problems.

There are 2 kinds of compaction, major and minor. Major compactions
are the ones that compact all the sstables for a given column family.
Minor compactions are the ones that are triggered automatically and
regularly. By definition they don't compact everything and thus don't
need half your disk space. Note however that over time, even a minor
compaction can require a fair amount of disk space, and could very well
require as much as half the disk space, but in practice that won't
happen all the time.

The other thing is that even a major compaction only has to be applied
to one Column Family at a time. So unless you have only one CF, or 90%
of your data in one CF (and for the record, there's nothing wrong with
that, it's just not necessarily your case), you won't need exactly half
your disk for a compaction.
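
To make the per-CF reasoning concrete, here is a rough Python sketch. It
assumes that, in the worst case, a major compaction temporarily needs
extra space equal to the current size of the column family being
compacted (nothing gets purged). The function name and the CF sizes are
hypothetical and only there for illustration.

```python
def major_compaction_headroom(cf_sizes_gb, disk_capacity_gb):
    """Estimate whether the largest column family can be major-compacted.

    Assumption (not a Cassandra API): the worst-case temporary space
    needed equals the current on-disk size of the CF being compacted.
    Returns (worst_case_extra_gb, free_gb, fits).
    """
    used_gb = sum(cf_sizes_gb.values())
    free_gb = disk_capacity_gb - used_gb
    # Only one CF is compacted at a time, so the largest CF sets the bound.
    worst_case_extra_gb = max(cf_sizes_gb.values(), default=0)
    return worst_case_extra_gb, free_gb, worst_case_extra_gb <= free_gb


# All 505 GB in a single CF on a 1 TB (1000 GB) disk: no room in the worst case.
single_cf = major_compaction_headroom({"cf": 505}, 1000)

# The same 505 GB split over three hypothetical CFs: the largest one fits easily.
three_cfs = major_compaction_headroom({"a": 250, "b": 155, "c": 100}, 1000)
```

With these made-up numbers, the single-CF layout needs up to 505 GB but
has only 495 GB free, while the three-CF layout needs at most 250 GB.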

All this to say that it is not as simple as: you've reached half your
disk space, so you are necessarily doomed. Chances are you'll never hit
any problem until you're, say, 70% full (or more). But there is no
foolproof number here so, as I said earlier, hitting 50% should be a
first sign that you may need a plan for the future.
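
As a minimal sketch of the "first alert" idea, the check below classifies
disk usage against the two rough figures mentioned above (50% and 70%).
Neither number is a hard limit, and the function names and labels are
hypothetical. Only shutil.disk_usage comes from the Python standard
library.

```python
import shutil

# Rough figures from the discussion above, not hard limits.
WATCH_THRESHOLD = 0.50  # first alert: start monitoring more carefully
ACT_THRESHOLD = 0.70    # problems become likely somewhere past this point


def classify_usage(used_fraction):
    """Map a used-space fraction (0.0 to 1.0) to a coarse status label."""
    if used_fraction >= ACT_THRESHOLD:
        return "act"
    if used_fraction >= WATCH_THRESHOLD:
        return "watch"
    return "ok"


def check_data_dir(path):
    """Classify the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return classify_usage(usage.used / usage.total)


# Example call; the path is an assumption and depends on your install:
# print(check_data_dir("/var/lib/cassandra/data"))
```

For the 505GB-on-1TB case from the question, classify_usage(0.505)
returns "watch" rather than "act".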

--
Sylvain
