cassandra-user mailing list archives

From kurt Greaves <k...@instaclustr.com>
Subject Re: nodetool repair with -pr and -dc
Date Thu, 11 Aug 2016 23:14:33 GMT
-D does not do what you think it does. I've quoted the relevant
documentation from the README:

>
> Multiple Datacenters
> <https://github.com/BrianGallew/cassandra_range_repair#multiple-datacenters>
>
> If you have multiple datacenters in your ring, then you MUST specify the
> name of the datacenter containing the node you are repairing as part of the
> command-line options (--datacenter=DCNAME). Failure to do so will result in
> only a subset of your data being repaired (approximately
> data/number-of-datacenters). This is because nodetool has no way to
> determine the relevant DC on its own, which in turn means it will use the
> tokens from every ring member in every datacenter.
>

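To make the README's point concrete, here is a minimal sketch of the range arithmetic. It is not range_repair.py's actual code; the ring, node names, and the 0..99 token space are made up for illustration, and `primary_ranges` and `coverage` are hypothetical helpers. It assumes each node's range runs from the previous token in the ring up to its own token.

```python
from typing import Dict, Optional, Tuple

# Hypothetical ring: token -> (node, datacenter), on a tiny 0..99 token space.
RING: Dict[int, Tuple[str, str]] = {
    0: ("dc1-a", "DC1"),
    25: ("dc2-a", "DC2"),
    50: ("dc1-b", "DC1"),
    75: ("dc2-b", "DC2"),
}
RING_SIZE = 100


def primary_ranges(ring: Dict[int, Tuple[str, str]],
                   datacenter: Optional[str] = None) -> Dict[str, Tuple[int, int]]:
    """Map each node to its range (start, end], using only tokens from
    `datacenter`, or tokens from every datacenter when it is None."""
    tokens = sorted(t for t, (_, dc) in ring.items()
                    if datacenter is None or dc == datacenter)
    ranges = {}
    for i, token in enumerate(tokens):
        previous = tokens[i - 1]  # index -1 wraps around the ring for i == 0
        ranges[ring[token][0]] = (previous, token)
    return ranges


def range_width(start: int, end: int) -> int:
    """Width of the wrapping range (start, end]; a lone node owns the full ring."""
    return (end - start) % RING_SIZE or RING_SIZE


def coverage(ring: Dict[int, Tuple[str, str]], repaired_dc: str,
             tokens_from_dc: Optional[str]) -> float:
    """Fraction of the ring covered when every node in `repaired_dc` repairs
    the range computed from `tokens_from_dc`'s tokens."""
    ranges = primary_ranges(ring, tokens_from_dc)
    nodes = {node for node, dc in ring.values() if dc == repaired_dc}
    covered = sum(range_width(*r) for node, r in ranges.items() if node in nodes)
    return covered / RING_SIZE


# Without a datacenter filter, tokens from DC2 split DC1's ranges in half.
print(coverage(RING, "DC1", None))   # 0.5
# With the filter, the DC1 nodes' ranges cover the whole ring.
print(coverage(RING, "DC1", "DC1"))  # 1.0
```

With two datacenters and evenly interleaved tokens, the unfiltered run covers half the ring, which matches the README's "data/number-of-datacenters" estimate. Note that the filter only changes which tokens define the ranges; it does not restrict which replicas take part in the repair.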

On 11 August 2016 at 12:24, Paulo Motta <pauloricardomg@gmail.com> wrote:

> > If we want to use the -pr option (which I suppose we should, to prevent
> > duplicate checks) in 2.0, then if we run the repair on all nodes in a single
> > DC it should be sufficient, and we should not need to run it on all
> > nodes across DCs?
>
> No, because the primary ranges of the nodes in other DCs will be missing
> repair, so you should either run with -pr in all nodes in all DCs, or
> restrict repair to a specific DC with -local (and have duplicate checks).
> Combining -pr and -local is only supported from 2.1 onward.
>
>
> 2016-08-11 1:29 GMT-03:00 Anishek Agarwal <anishek@gmail.com>:
>
>> OK, thanks. So if we want to use the -pr option (which I suppose we should, to
>> prevent duplicate checks) in 2.0, then if we run the repair on all nodes in
>> a single DC it should be sufficient, and we should not need to run it
>> on all nodes across DCs?
>>
>>
>>
>> On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta <pauloricardomg@gmail.com>
>> wrote:
>>
>>> On 2.0, the repair -pr option is not supported together with -local, -hosts
>>> or -dc, since it assumes you need to repair all nodes in all DCs, and
>>> nodetool will throw an error if you try that combination, so perhaps there's
>>> something wrong with range_repair's option parsing.
>>>
>>> Support for simultaneous -pr and -local options was added in 2.1 by
>>> CASSANDRA-7450, so if you need that you can either upgrade to 2.1 or
>>> backport that change to 2.0.
>>>
>>>
>>> 2016-08-10 5:20 GMT-03:00 Anishek Agarwal <anishek@gmail.com>:
>>>
>>>> Hello,
>>>>
>>>> We have a 2.0.17 Cassandra cluster (*DC1*) in a cross-DC setup with a
>>>> smaller cluster (*DC2*). After reading various blogs about
>>>> scheduling/running repairs, it looks like it's good to run them with the
>>>> following options:
>>>>
>>>>
>>>> -pr for primary range only
>>>> -st -et for sub ranges
>>>> -par for parallel
>>>> -dc to make sure we can schedule repairs independently on each Data
>>>> centre we have.
>>>>
>>>> I have configured the above using the repair utility at
>>>> https://github.com/BrianGallew/cassandra_range_repair.git
>>>>
>>>> which leads to the following command:
>>>>
>>>> ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H localhost -p -D DC1
>>>>
>>>> but it looks like the merkle tree is being calculated on nodes which are
>>>> part of the other DC, *DC2*.
>>>>
>>>> Why does this happen? I thought it should only look at the nodes in the
>>>> local cluster. However, with nodetool the *-pr* option cannot be used
>>>> with *-local*, according to the docs at
>>>> https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsRepair.html
>>>>
>>>> So I may be missing something; can someone help explain this, please?
>>>>
>>>> thanks
>>>> anishek
>>>>
>>>
>>>
>>
>

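The version constraint Paulo describes in the quoted thread can be sketched as a small guard around the nodetool command line. This is only an illustration: `build_repair_command` is a hypothetical helper, not part of nodetool or range_repair.py, and it assumes the options are placed before the keyspace name. The only flags it uses (-pr, -local, -par) are the ones mentioned in this thread.

```python
from typing import List, Tuple


def build_repair_command(version: Tuple[int, int], keyspace: str,
                         primary_range: bool = False,
                         local_dc: bool = False,
                         parallel: bool = False) -> List[str]:
    """Build a `nodetool repair` argument list (hypothetical helper).

    Refuses to combine -pr with -local before 2.1, since per this thread
    that combination is only supported from 2.1 (CASSANDRA-7450).
    """
    if primary_range and local_dc and version < (2, 1):
        raise ValueError("-pr cannot be combined with -local before Cassandra 2.1")
    command = ["nodetool", "repair"]
    if primary_range:
        command.append("-pr")
    if local_dc:
        command.append("-local")
    if parallel:
        command.append("-par")
    command.append(keyspace)
    return command


# On 2.0: run -pr on every node in every DC ...
print(build_repair_command((2, 0), "my_keyspace", primary_range=True))
# ... or restrict to the local DC without -pr (accepting duplicate checks).
print(build_repair_command((2, 0), "my_keyspace", local_dc=True))
```

Here `my_keyspace` is a placeholder name. Asking for both `primary_range=True` and `local_dc=True` with version `(2, 0)` raises `ValueError`, while the same call with `(2, 1)` returns a command containing both flags.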

-- 
Kurt Greaves
kurt@instaclustr.com
www.instaclustr.com
