On Thu, Feb 28, 2013 at 11:39 PM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
> Isn't it true if I have 6 nodes, I could run nodetool repair on just 2
> nodes (RF=3) instead of using nodetool repair -pr?
>
Yes, it is true.
To be more precise, in your case you have 2 options:
1) doing repair *without* pr on 2 nodes (assuming you pick the correct 2
nodes, it's *not* any 2 nodes)
2) doing a repair *with* pr on the 6 nodes
Both of those cases would 1) repair the full ring and 2) do the same amount
of work.
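
To illustrate which pairs are "the correct 2 nodes", here is a minimal
Python sketch. It assumes a simplified ring model that is not spelled out
above: one token per node, nodes numbered 0..5 in ring order, and each
range stored on its primary node plus the next RF-1 nodes on the ring. The
function names are hypothetical, chosen only for this illustration.

```python
from itertools import combinations

def ranges_held(node, n, rf):
    # Assumed placement: a node holds its own primary range plus the
    # primary ranges of the rf-1 nodes preceding it on the ring.
    return {(node - k) % n for k in range(rf)}

def covers_ring(nodes, n, rf):
    # A repair without -pr on a node repairs every range that node holds.
    covered = set()
    for node in nodes:
        covered |= ranges_held(node, n, rf)
    return covered == set(range(n))

n, rf = 6, 3
good_pairs = [pair for pair in combinations(range(n), 2)
              if covers_ring(pair, n, rf)]
print(good_pairs)  # [(0, 3), (1, 4), (2, 5)]
```

Under that model only 3 of the 15 possible pairs cover the full ring: the
two nodes have to be exactly RF positions apart.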
> What is the advantage of -pr then?
As it happens, your case is a special case: your number of nodes is a
multiple of your replication factor. If that wasn't the case (say 5, 7 or
8 nodes with RF=3), then there is *no way* to repair the whole cluster
*without* pr without doing *more* work than a repair *with* pr on all
nodes.
So the advantages of pr (which btw, should be used for repairing the whole
cluster, not when you want to rebuild a specific node) are:
1) it always does the minimum amount of work, while repair without pr is
wasteful if the number of nodes is not a multiple of the replication
factor (no matter how smart you are at scheduling the repairs).
2) even if your number of nodes is a multiple of the replication factor,
you still have to make sure you pick the right N/RF nodes to repair if you
don't use pr. If you don't pick the correct ones, you will not repair the
full ring. With pr it is much harder to shoot yourself in the foot: you
have to run it on every node, period.
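
As a rough illustration of point 1, the following Python sketch counts
work in "range repairs". It assumes the same simplified model as above
(one token per node, each node holding RF ranges) and ignores the actual
amount of data per range. The helper names are hypothetical.

```python
from math import ceil

def work_without_pr(n, rf):
    # Each repair without -pr covers the rf ranges a node holds, so at
    # least ceil(n / rf) nodes are needed to touch all n ranges.
    return ceil(n / rf) * rf

def work_with_pr(n):
    # With -pr each node repairs only its primary range, exactly once.
    return n

for n in (5, 6, 7, 8):
    print(n, work_with_pr(n), work_without_pr(n, 3))
# 5 5 6
# 6 6 6
# 7 7 9
# 8 8 9
```

Only the 6-node case breaks even. In the other cases some ranges get
repaired more than once when pr is not used.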

Sylvain
