incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: -pr vs. no -pr
Date Fri, 01 Mar 2013 11:36:28 GMT
On Thu, Feb 28, 2013 at 11:39 PM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:

> Isn't it true if I have 6 nodes, I could run nodetool repair on just 2
> nodes (RF=3) instead of using nodetool repair -pr???
>

Yes, it is true.

To be more precise, in your case you have 2 options:
 1) doing repair *without* -pr on 2 nodes (assuming you pick the correct 2
nodes, it's *not* any 2 nodes)
 2) doing a repair *with* -pr on the 6 nodes

Both of those cases would 1) repair the full ring and 2) do the same amount
of work.
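
To illustrate the two options, here is a minimal sketch of a toy ring model.
It assumes one token per node (no vnodes) and that the replicas of range i
are nodes i, i+1, ..., i+RF-1, so node j holds ranges j, j-1, ..., j-RF+1.
The function names are made up for this example and "work" is simply the
number of range repairs, not anything nodetool reports.

```python
def ranges_repaired(node, n_nodes, rf, primary_only):
    """Return the token ranges (by index) that one repair on `node` covers."""
    if primary_only:
        # -pr: only the range this node is the primary replica for.
        return [node]
    # No -pr: every range this node holds a replica of (assumed placement).
    return [(node - k) % n_nodes for k in range(rf)]


def repair_plan(nodes, n_nodes, rf, primary_only):
    """Return (number of range repairs, whether the full ring is covered)."""
    work = [r for node in nodes
            for r in ranges_repaired(node, n_nodes, rf, primary_only)]
    full_ring = set(work) == set(range(n_nodes))
    return len(work), full_ring


with_pr = repair_plan(range(6), 6, 3, primary_only=True)    # -pr on all 6 nodes
good_pair = repair_plan([2, 5], 6, 3, primary_only=False)   # a correct pair
bad_pair = repair_plan([0, 1], 6, 3, primary_only=False)    # two adjacent nodes
print(with_pr, good_pair, bad_pair)
# (6, True) (6, True) (6, False)
```

In this model the adjacent pair does the same 6 range repairs as the correct
pair, but it repairs two ranges twice and never touches two others.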


> What is the advantage of -pr then?


As it happens, your case is a special case: your number of nodes is a
multiple of your replication factor. If that wasn't the case (say 5, 7 or 8
nodes with RF=3), then there is *no way* to repair the whole cluster
*without* -pr without doing *more* work than by doing a repair *with* -pr
on all nodes.
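
The claim above can be checked by brute force on a toy model. This is only a
sketch under assumptions: one token per node, replicas of range i placed on
nodes i, i+1, ..., i+RF-1, and "work" counted as the number of range repairs.
The function name is made up for the example.

```python
from itertools import combinations


def min_work_without_pr(n_nodes, rf):
    """Smallest number of range repairs that covers the ring without -pr."""
    all_ranges = set(range(n_nodes))
    # Try node subsets from smallest to largest; the first full cover is minimal.
    for size in range(1, n_nodes + 1):
        for nodes in combinations(range(n_nodes), size):
            covered = {(node - k) % n_nodes
                       for node in nodes for k in range(rf)}
            if covered == all_ranges:
                # Each repair without -pr touches rf ranges.
                return size * rf
    return None  # not reached when rf <= n_nodes


for n in (5, 6, 7, 8):
    # Repair with -pr on every node always costs exactly n range repairs.
    print(n, "with -pr:", n, "best without -pr:", min_work_without_pr(n, 3))
# 5 with -pr: 5 best without -pr: 6
# 6 with -pr: 6 best without -pr: 6
# 7 with -pr: 7 best without -pr: 9
# 8 with -pr: 8 best without -pr: 9
```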

So the advantages of -pr (which btw, should be used to repair the whole
cluster, not when you want to rebuild a specific node) are:
 1) it always does the minimum amount of work, while repair without -pr is
wasteful if the number of nodes is not a multiple of the replication factor
(no matter how smart you are at scheduling the repairs).
 2) even if your number of nodes is a multiple of the replication factor,
you still have to make sure you pick the right N/RF nodes to repair if you
don't use -pr. If you don't pick the correct ones, you will not repair the
full ring. With -pr it is much harder to shoot yourself in the foot: you
have to run it on every node, period.

--
Sylvain
