cassandra-user mailing list archives

From Watanabe Maki <watanabe.m...@gmail.com>
Subject Re: nodetool repair & compact
Date Wed, 06 Apr 2011 22:59:08 GMT
Thanks a lot. It has become clear to me.

From iPhone


On 2011/04/06, at 23:51, Sylvain Lebresne <sylvain@datastax.com> wrote:

> On Tue, Apr 5, 2011 at 9:03 PM, Maki Watanabe <watanabe.maki@gmail.com> wrote:
>> Thanks Sylvain, it's very clear.
>> But do I still need to force a major compaction regularly to clear tombstones?
>> I know that minor compaction clears tombstones since 0.7, but
>> maximumCompactionThreshold limits the maximum number of sstables that
>> will be merged at once, so to GC all tombstones in all sstables within
>> gc_grace_period, it is safe to run "nodetool compact" at least once per
>> gc_grace_period, isn't it?
> 
> You don't *need* tombstones to be cleared within gc_grace_period. What you
> need is to make sure for a given tombstone t, that each node will get t within
> gc_grace_period. This means that if a node dies, you need it to be up again
> and have nodetool repair run before gc_grace_period expires, otherwise there may
> be some tombstones that this node will never see (and thus deleted data
> could be resurrected by this node).
> 
> So repair should be run at least once in gc_grace_period to be on the safe side.
> Compact is not necessary, however. The only downside of not running compact
> regularly is that some tombstones may take longer to be removed (since minor
> compactions are potentially less efficient at removing them), which really
> only impacts disk space usage. And given that major compactions are fairly
> heavy on resource usage and have the nasty effect of producing only one huge
> sstable, my advice would be not to run major compaction unless you have good
> reason to suspect you need it.
> 
> --
> Sylvain
> 
>> 
>> maki
>> 
>> 2011/4/6 Sylvain Lebresne <sylvain@datastax.com>:
>>> On Tue, Apr 5, 2011 at 12:01 AM, Maki Watanabe <watanabe.maki@gmail.com> wrote:
>>>> Hello,
>>>> After reading O'Reilly's Cassandra book and the wiki, I'm a bit confused about
>>>> nodetool repair and compact.
>>>> I believe we need to run nodetool repair regularly, and that it synchronizes
>>>> all replica nodes by the end.
>>>> According to the documents, "repair" also invokes a major compaction
>>>> (as a side effect?).
>>> 
>>> Those documents are wrong then. A repair does not trigger a major
>>> compaction. The only thing that makes it similar to a major compaction is
>>> that it will iterate over all the sstables. But for instance, you won't end
>>> up with one big sstable at the end of repair as you would with a major
>>> compaction.
>>> 
>>>> Will this "major compaction" apply to replica nodes too?
>>>> 
>>>> If I have a 3-node ring and a CF with RF=3, what I should do periodically on
>>>> this system is:
>>>> - nodetool repair on one of the nodes
>>>> or
>>>> - nodetool repair on one of the nodes, and nodetool compact on 2 of the nodes
>>>> ?
>>> 
>>> So as said, repair and compact are independent. You should periodically
>>> run nodetool repair (on one of your nodes in your case, as you said).
>>> However, it is not advised anymore to run nodetool compact regularly
>>> unless you have a good reason to.
>>> 
>>> --
>>> Sylvain
>>> 
>> 
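
The rule discussed in this thread (run nodetool repair at least once per gc_grace_period) can be checked with a few lines of bookkeeping. The following is only a sketch, not part of Cassandra or nodetool: the function name, the node names, and the timestamps are hypothetical, and the 864000-second (10-day) grace period is assumed to be the default and should be replaced with the value configured for your column family. Tracking one timestamp per node is also an assumption made for illustration.

```python
from datetime import datetime, timedelta

# Assumption: 864000 seconds (10 days) is the default grace period;
# use the value actually configured for your column family.
GC_GRACE = timedelta(seconds=864000)


def nodes_overdue_for_repair(last_repair, now, gc_grace=GC_GRACE):
    """Return, sorted, the nodes whose last completed repair is at least gc_grace old."""
    return sorted(
        node for node, finished in last_repair.items()
        if now - finished >= gc_grace
    )


# Hypothetical record of when "nodetool repair" last completed on each node.
last_repair = {
    "node1": datetime(2011, 4, 1),
    "node2": datetime(2011, 3, 20),
    "node3": datetime(2011, 4, 5),
}
now = datetime(2011, 4, 6)

overdue = nodes_overdue_for_repair(last_repair, now)
print(overdue)
```

Any node reported here has gone a full grace period without a repair, so it may have missed tombstones that the other replicas have already purged. In practice one would alert well before that limit rather than at it.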
