Isn't there more to it than that? You really have nodes responsible for
token ranges, like so (using describe ring).
What we see from our describe ring is this (1 to 6 are token ranges, while
A to F are servers):
A - 1, 2, 3
B - 2, 3, 4
C - 3, 4, 5
D - 4, 5, 6
E - 5, 6, 1
F - 6, 1, 2
With -pr, only token range 1 is repaired on A, I think, right? And 2 and 3
are only repaired without the -pr option? This means that if I have a node
that just joined the cluster, I should "not" be using -pr, as 2 and 3 on
node A will not be up to date. Using -pr is nice if I am going to repair
every single node, and it is nice for the cron job that has to run before
gc_grace_seconds elapses. Am I wrong here? I.e., -pr is really only good
for use in the cron job, as on a single node it would miss 2 and 3 above.
I could run the cron on just two servers (a full repair, without -pr), but
then my nodes are set up differently, which can be a hassle.
Could you confirm that this is what you believe happens as well?
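
To make the question concrete, here is a rough Python sketch of the table
above. It only models my understanding and does not talk to Cassandra. The
names REPLICAS and repaired_ranges are made up for illustration, and it
assumes that the first range listed for a server is its primary range
(which is exactly the part I am asking you to confirm).

```python
# Table from describe ring: server -> token ranges it holds (RF=3).
REPLICAS = {
    "A": [1, 2, 3],
    "B": [2, 3, 4],
    "C": [3, 4, 5],
    "D": [4, 5, 6],
    "E": [5, 6, 1],
    "F": [6, 1, 2],
}

# Every token range that appears anywhere in the table.
ALL_RANGES = {r for ranges in REPLICAS.values() for r in ranges}


def repaired_ranges(nodes, primary_only):
    """Return the set of token ranges touched by repairing the given nodes.

    ASSUMPTION: a node's primary range is the first one listed for it.
    primary_only=True models "repair -pr", False models a full repair.
    """
    covered = set()
    for node in nodes:
        ranges = REPLICAS[node]
        covered.update(ranges[:1] if primary_only else ranges)
    return covered


# -pr on A alone vs. a full repair on A alone.
print(repaired_ranges(["A"], primary_only=True))
print(repaired_ranges(["A"], primary_only=False))

# -pr on every node, and a full repair on just A and D.
print(repaired_ranges(list(REPLICAS), primary_only=True) == ALL_RANGES)
print(repaired_ranges(["A", "D"], primary_only=False) == ALL_RANGES)
```

If the assumption holds, -pr on A alone leaves ranges 2 and 3 on A
untouched, while -pr on all six nodes, or a full repair on A and D, reaches
every range.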
Thanks,
Dean
On 2/28/13 5:58 PM, "Takenori Sato(Cloudian)" wrote:
>Hi,
>
>Please note that I confirmed this on v1.0.7.
>
> > I mean a repair involves all three nodes and pushes and pulls data,
> > right?
>
>Yes, but that's how -pr works. A repair without -pr does more.
>
>For example, suppose you have a ring with RF=3 like this.
>
>A - B - C - D - E - F
>
>Then, a repair on A without -pr covers 3 ranges, as follows:
>[A, B, C]
>[E, F, A]
>[F, A, B]
>
>Among them, the first one, [A, B, C] is the primary range of A.
>
>So, with -pr, a repair runs only for:
>[A, B, C]
>
> > I could run nodetool repair on just 2 nodes (RF=3) instead of using
> > nodetool repair -pr???
>
>Yes.
>
>You need to run two repairs on A and D.
>
> > What is the advantage of pr then?
>
>Whenever you want to minimize repair impact.
>
>For example, suppose one node was down for a while, and you bring it
>back to the cluster.
>
>You need to run a repair without affecting the entire cluster. In that
>case, -pr is the option.
>
>Thanks,
>Takenori
>
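
For reference, the ring example quoted above can be reproduced with a small
Python sketch. It is only a model of the description in the quoted mail,
not of Cassandra's actual code: RING, RF, replica_set and
ranges_repaired_on are hypothetical names, and each range is written as the
list of its replicas, with the primary owner first and the next RF-1 nodes
on the ring after it.

```python
RING = ["A", "B", "C", "D", "E", "F"]  # ring order from the quoted example
RF = 3                                 # replication factor


def replica_set(primary):
    """Replicas of the range whose primary owner is `primary`:
    the owner itself plus the next RF-1 nodes around the ring."""
    i = RING.index(primary)
    return [RING[(i + k) % len(RING)] for k in range(RF)]


def ranges_repaired_on(node, primary_only=False):
    """Ranges (as replica lists) touched by a repair started on `node`.

    primary_only=True models -pr: only the node's own primary range.
    Otherwise every range that `node` holds a replica of is included.
    """
    if primary_only:
        return [replica_set(node)]
    all_sets = (replica_set(p) for p in RING)
    return [rs for rs in all_sets if node in rs]


print(ranges_repaired_on("A"))
print(ranges_repaired_on("A", primary_only=True))

# Full repairs on A and D together: how many distinct ranges are reached?
covered = {tuple(rs) for n in ("A", "D") for rs in ranges_repaired_on(n)}
print(len(covered) == len(RING))
```

Under this model a full repair on A touches [A, B, C], [E, F, A] and
[F, A, B], -pr on A touches only [A, B, C], and full repairs on A and D
between them reach all six ranges.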
