incubator-cassandra-user mailing list archives

From Sylvain Lebresne <>
Subject Re: nodetool repair & compact
Date Wed, 06 Apr 2011 14:51:58 GMT
On Tue, Apr 5, 2011 at 9:03 PM, Maki Watanabe <> wrote:
> Thanks Sylvain, it's very clear.
> But do I still need to force a major compaction regularly to clear tombstones?
> I know that minor compaction clears tombstones as of 0.7, but
> maximumCompactionThreshold limits the maximum number of sstables that
> will be merged at once, so to GC all tombstones in all sstables within
> gc_grace_period, it is safe to run "nodetool compact" at least once per
> gc_grace_period, isn't it?

You don't *need* tombstones to be cleared within gc_grace_period. What you
need is to make sure that, for a given tombstone t, each node gets t within
gc_grace_period. This means that if a node dies, you need it to be back up
and to have nodetool repair run on it before gc_grace_period elapses.
Otherwise there may be some tombstones that this node will never see (and
thus deleted data could be resurrected by this node).
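
To make the failure mode concrete, here is a toy sketch in Python. It is not
Cassandra code: the dictionaries standing in for replicas, the GC_GRACE value
and the function names are all made up for illustration, and "repair" is
reduced to "the newest timestamp wins".

```python
# Toy model, not Cassandra code. A replica maps key -> (timestamp, value);
# a value of None stands for a tombstone. All names here are hypothetical.
GC_GRACE = 10  # illustrative gc_grace_period, in arbitrary time units


def purge_tombstones(replica, now):
    """Drop tombstones older than GC_GRACE, as compaction is allowed to do."""
    expired = [k for k, (ts, v) in replica.items()
               if v is None and now - ts > GC_GRACE]
    for key in expired:
        del replica[key]


def repair(a, b):
    """Make both replicas agree: for each key the newest timestamp wins."""
    for key in set(a) | set(b):
        cells = [r[key] for r in (a, b) if key in r]
        winner = max(cells, key=lambda cell: cell[0])
        a[key] = b[key] = winner


def scenario(repaired_in_time):
    """Return what replica `up` holds for key 'k' at the end of the run."""
    up = {"k": (1, "v")}
    down = {"k": (1, "v")}   # this node is down and misses the delete
    up["k"] = (2, None)      # delete at t=2, seen only by `up`

    if repaired_in_time:
        repair(up, down)     # tombstone reaches `down` before it expires

    now = 2 + GC_GRACE + 1   # gc_grace_period has elapsed
    purge_tombstones(up, now)
    purge_tombstones(down, now)

    repair(up, down)         # a later repair (or read repair)
    return up.get("k")


print(scenario(repaired_in_time=True))   # key stays deleted
print(scenario(repaired_in_time=False))  # old value comes back
```

In the second run the tombstone is purged from the only node that had it, so
nothing is left to shadow the stale value on the node that was down.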

So repair should be run at least once per gc_grace_period to be on the safe side.
Compact is not necessary, however. The only downside of not running compact
regularly is that some tombstones may take longer to be removed (since minor
compactions are potentially less efficient at removing them), which really
only impacts disk space usage. And given that major compactions are fairly
heavy on resource usage and have the nasty effect of producing only one huge
sstable, my advice would be to not run a major compaction unless you have good
reason to suspect you need it.
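
If you want to script the "at least once per gc_grace_period" rule, a minimal
sketch could look like the following. The function name and the safety_margin
parameter are my own assumptions, not anything nodetool provides; 864000
seconds (10 days) is used only as an example value for the grace period.

```python
from datetime import datetime, timedelta


def repair_is_overdue(last_repair, gc_grace_seconds, now, safety_margin=0.5):
    """Return True once a node should be repaired again.

    safety_margin is a hypothetical knob: 0.5 means "repair again after half
    of gc_grace_period has passed", leaving room for a slow or failed run.
    """
    deadline = last_repair + timedelta(seconds=gc_grace_seconds * safety_margin)
    return now >= deadline


last = datetime(2011, 4, 1)
print(repair_is_overdue(last, 864000, datetime(2011, 4, 5)))  # 4 days in
print(repair_is_overdue(last, 864000, datetime(2011, 4, 7)))  # 6 days in
```

Note that nothing here schedules compaction; only repair is tied to
gc_grace_period.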


> maki
> 2011/4/6 Sylvain Lebresne <>:
>> On Tue, Apr 5, 2011 at 12:01 AM, Maki Watanabe <> wrote:
>>> Hello,
>>> After reading O'Reilly's Cassandra book and the wiki, I'm a bit confused
>>> about nodetool repair and compact.
>>> I believe we need to run nodetool repair regularly, and that it
>>> synchronizes all replica nodes by the end.
>>> According to the documents, "repair" also invokes a major compaction
>>> (as a side effect?).
>> Those documents are wrong then. A repair does not trigger a major
>> compaction. The only thing that makes it similar to a major compaction is
>> that it will iterate over all the sstables. But for instance, you won't end
>> up with one big sstable at the end of repair as you would with a major
>> compaction.
>>> Will this "major compaction" apply on replica nodes too?
>>> If I have a 3-node ring and a CF with RF=3, what I should do periodically
>>> on this system is:
>>> - nodetool repair on one of the nodes
>>> or
>>> - nodetool repair on one of the nodes, and nodetool compact on 2 of the nodes
>>> ?
>> So as said, repair and compact are independent. You should periodically
>> run nodetool repair (on one of your nodes in your case as you said).
>> However, it is not advised anymore to run nodetool compact regularly
>> unless you have a good reason to.
>> --
>> Sylvain
