incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: nodetool repair -pr enough in this scenario?
Date Tue, 05 Jun 2012 08:01:43 GMT
On Tue, Jun 5, 2012 at 8:44 AM, Viktor Jevdokimov <
Viktor.Jevdokimov@adform.com> wrote:

> Understand simple mechanics first, decide how to act later.
>
> Without -pr there’s no difference from which host to run repair, it runs
> for the whole 100% range, from start to end, the whole cluster, all nodes,
> at once.
>

That's not exactly true. A repair without -pr repairs all the ranges of
the node on which repair is run, that is, only the ranges that node is a
replica for. It will *not* repair the whole cluster (unless the
replication factor is equal to the number of nodes in the cluster, but
that's a degenerate case). Hence it does matter on which host repair is
run (it always matters, whether you use -pr or not).
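
To make that concrete, here is a minimal sketch in Python of the 3-node,
RF=2 ring from the quoted example (tokens N1=A, N2=B, N3=C). It assumes
simple placement with no rack or DC awareness, where a range lives on its
primary node and the next RF - 1 nodes clockwise. The function names are
made up for illustration and are not Cassandra APIs.

```python
# Illustrative model only: a ring with simple, non-rack-aware placement.
# primary_ranges / replicated_ranges are hypothetical helpers, not Cassandra APIs.

def primary_ranges(tokens):
    """Map each node to its primary range (previous node's token, own token)."""
    nodes = sorted(tokens, key=tokens.get)
    return {
        node: (tokens[nodes[i - 1]], tokens[node])
        for i, node in enumerate(nodes)
    }

def replicated_ranges(tokens, rf):
    """Map each node to every range it holds a replica of."""
    nodes = sorted(tokens, key=tokens.get)
    primary = primary_ranges(tokens)
    held = {node: [] for node in nodes}
    for i, owner in enumerate(nodes):
        # Assumption: a range lives on its primary node and on the
        # next rf - 1 nodes clockwise around the ring.
        for k in range(rf):
            held[nodes[(i + k) % len(nodes)]].append(primary[owner])
    return held

tokens = {"N1": "A", "N2": "B", "N3": "C"}
with_pr = primary_ranges(tokens)               # ranges covered by repair -pr
without_pr = replicated_ranges(tokens, rf=2)   # ranges covered by plain repair

print("N2 with -pr:   ", with_pr["N2"])
print("N2 without -pr:", without_pr["N2"])
```

In this model, repair without -pr on N2 covers C-A and A-B but never B-C,
which N2 holds no replica of. Setting rf=3 gives the degenerate case where
every node holds every range.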

In general you want to use repair without -pr when you want to repair a
specific node. Typically, if a node was dead for a reasonably long time,
you may want to run a repair (without -pr) on that specific node to have
it catch up faster (faster than if you were only relying on read-repair
and hinted-handoff).

For repairing a whole cluster, as is the case for the weekly scheduled
repairs in the initial question, you want to use -pr. You *do not* want to
use repair without -pr in that case, because for that task using -pr is
more efficient (to be clear, not using -pr won't cause problems, but it is
less efficient).
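
The efficiency difference can be sketched by counting how often each range
gets repaired when every node runs repair once, as in the weekly schedule.
This is a rough Python model, not Cassandra code: it assumes the same
simple placement (a range lives on its primary node and the next RF - 1
nodes clockwise), and RF=3 is only an assumed value since the original
question does not state the replication factor.

```python
from collections import Counter

def repair_counts(num_nodes, rf, primary_only):
    """Count how often each range is repaired when every node runs repair once.

    Range i is the primary range of node i. Hypothetical helper for
    illustration; nothing here is a Cassandra API.
    """
    counts = Counter()
    for node in range(num_nodes):
        if primary_only:
            # repair -pr: only the node's own primary range
            covered = [node]
        else:
            # plain repair: the node's own range plus the ranges of the
            # rf - 1 preceding nodes, which it holds replicas of
            covered = [(node - k) % num_nodes for k in range(rf)]
        counts.update(covered)
    return dict(counts)

# 4 nodes as in the original question; rf=3 is an assumption.
print("with -pr:   ", repair_counts(4, 3, primary_only=True))
print("without -pr:", repair_counts(4, 3, primary_only=False))
```

Under these assumptions every range is still covered either way, but
without -pr each range is repaired RF times over the schedule instead of
once.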

--
Sylvain



>
>
> With -pr it runs only for a primary range of a node you are running a
> repair.
>
> Let say you have simple ring of 3 nodes with RF=2 and ranges (per node)
> N1=C-A, N2=A-B, N3=B-C (node tokens are N1=A, N2=B, N3=C). No rack, no DC
> aware.
>
> So running repair with -pr on node N2 will only repair a range A-B, for
> which node N2 is a primary and N3 is a backup. N2 and N3 will synchronize
> A-B range one with other. For other ranges you need to run on other nodes.
>
> Without -pr running on any node will repair all ranges, A-B, B-C, C-A. A
> node you run a repair without -pr is just a repair coordinator, so no
> difference, which one will be next time.
>
>
>    Best regards / Pagarbiai
> *Viktor Jevdokimov*
> Senior Developer
>
> Email: Viktor.Jevdokimov@adform.com
> Phone: +370 5 212 3063, Fax +370 5 261 0453
> J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
>
> *From:* David Daeschler [mailto:david.daeschler@gmail.com]
> *Sent:* Tuesday, June 05, 2012 08:59
> *To:* user@cassandra.apache.org
> *Subject:* nodetool repair -pr enough in this scenario?
>
> Hello,
>
> Currently I have a 4 node cassandra cluster on CentOS64. I have been
> running nodetool repair (no -pr option) on a weekly schedule like:
>
> Host1: Tue, Host2: Wed, Host3: Thu, Host4: Fri
>
> In this scenario, if I were to add the -pr option, would this still be
> sufficient to prevent forgotten deletes and properly maintain consistency?
>
> Thank you,
> - David
>
