incubator-cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: -pr vs. no -pr
Date Fri, 01 Mar 2013 13:46:38 GMT
Sweet, I 100% understand this now from these last few emails.  It has always been a bit confusing.


From: Sylvain Lebresne <<>>
Reply-To: "<>" <<>>
Date: Friday, March 1, 2013 4:36 AM
To: "<>" <<>>
Subject: Re: -pr vs. no -pr

On Thu, Feb 28, 2013 at 11:39 PM, Hiller, Dean <<>>
Isn't it true that if I have 6 nodes (RF=3), I could run nodetool repair on just 2 nodes instead
of using nodetool repair -pr?

Yes, it is true.

To be more precise, in your case you have 2 options:
 1) running repair *without* -pr on 2 nodes (assuming you pick the correct 2 nodes; *not*
just any 2 nodes will do)
 2) running repair *with* -pr on all 6 nodes

Both of those cases would 1) repair the full ring and 2) do the same amount of work.
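
To make the two options concrete, here is a rough sketch of the ring under a simplified
model. It assumes evenly placed single-token nodes and a SimpleStrategy-like placement, and
it counts work in "range repairs"; the function names are made up for illustration and are
not part of Cassandra or nodetool.

```python
def ranges_repaired(node, n, rf, primary_only):
    """Ranges touched by one repair on `node` in a ring of n nodes.

    Assumed model: range i is owned (primary) by node i and is also
    replicated on the next rf - 1 nodes, so node j holds ranges
    j, j-1, ..., j-rf+1 (mod n).
    """
    if primary_only:
        # repair -pr: only the node's primary range
        return {node}
    # repair without -pr: every range the node holds a replica of
    return {(node - k) % n for k in range(rf)}


def repair_plan(nodes, n, rf, primary_only):
    """Return (is the full ring covered?, number of range repairs done)."""
    covered, work = set(), 0
    for node in nodes:
        touched = ranges_repaired(node, n, rf, primary_only)
        covered |= touched
        work += len(touched)
    return covered == set(range(n)), work


print(repair_plan([2, 5], 6, 3, primary_only=False))   # (True, 6)
print(repair_plan(range(6), 6, 3, primary_only=True))  # (True, 6)
print(repair_plan([0, 1], 6, 3, primary_only=False))   # (False, 6)
```

In this model, nodes 2 and 5 without -pr and all 6 nodes with -pr both cover the ring with 6
range repairs, while two adjacent nodes (0 and 1) do the same amount of work but leave ranges
2 and 3 unrepaired.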

What is the advantage of -pr then?

As it happens, yours is a special case: your number of nodes is a multiple of your
replication factor. If that weren't the case (say 5, 7 or 8 nodes with RF=3), then
there is *no way* to repair the whole cluster *without* -pr without doing *more* work
than a repair *with* -pr on all nodes would.
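
The same claim can be checked by brute force under the same simplified assumptions as before
(evenly placed single-token nodes, each range replicated on its owner and the next RF - 1
nodes, work counted in range repairs). The helper names are hypothetical; this is only a
sketch of the counting argument, not how Cassandra schedules repairs.

```python
from itertools import combinations


def held_ranges(node, n, rf):
    """Ranges node holds a replica of: node, node-1, ..., node-rf+1 (mod n)."""
    return {(node - k) % n for k in range(rf)}


def min_work_without_pr(n, rf):
    """Fewest range repairs that cover the whole ring without -pr."""
    full = set(range(n))
    # Try ever larger sets of nodes until one of them covers every range.
    for count in range(1, n + 1):
        for nodes in combinations(range(n), count):
            covered = set().union(*(held_ranges(x, n, rf) for x in nodes))
            if covered == full:
                # each repaired node redoes all rf ranges it holds
                return count * rf
    raise ValueError("ring cannot be covered")


for n in (5, 6, 7, 8):
    # with -pr on every node the work is always n range repairs
    print(n, min_work_without_pr(n, 3), n)
# 5 6 5
# 6 6 6
# 7 9 7
# 8 9 8
```

Only the 6-node ring breaks even; with 5, 7 or 8 nodes the best possible schedule without -pr
still repairs some ranges more than once.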

So the advantages of -pr (which, btw, should be used for repairing the whole cluster, not when
you want to rebuild a specific node) are:
 1) it always does the minimum amount of work, while repair without -pr is wasteful if the
number of nodes is not a multiple of the replication factor (no matter how smart you are at
scheduling the repairs).
 2) even if your number of nodes is a multiple of the replication factor, you still have to
make sure you pick the right N/RF nodes to repair if you don't use -pr. If you don't pick
the correct ones, you will not repair the full ring. With -pr it is much harder to shoot
yourself in the foot: you have to run it on every node, period.

