cassandra-user mailing list archives

From Sylvain Lebresne <>
Subject Re: repair takes 10x more time in one DC compared to the other
Date Wed, 25 Jun 2014 16:48:51 GMT
I see. Well, you shouldn't use "-local" and "-pr" together; the combination
doesn't make sense, which is why it will be rejected in 2.0.9 (you can check
for details).
Basically, the result of using both is that a lot of stuff doesn't get

On Wed, Jun 25, 2014 at 6:11 PM, Paulo Ricardo Motta Gomes <> wrote:

> Thanks for the explanation, but I got slightly confused:
> From my understanding, you just described the behavior of the
> -pr/--partitioner-range option: "Repair only the first range returned by
> the partitioner for the node", so I would understand that repairs of the
> same CFs in different DCs with only the -pr option could take different
> times.
> However, according to the description of the -local/--in-local-dc option,
> it "only repairs against nodes in the same data center", but you said that "the
> range will be repaired for all replica in all data-centers", even with the
> "-local" option. Or did you confuse it with the "-pr" option?
> In any case, I'm using both the "-local" and "-pr" options. What is the
> expected behavior in that case?
> Cheers,
> On Wed, Jun 25, 2014 at 12:46 PM, Sylvain Lebresne <>
> wrote:
>> TL;DR, this is not unexpected and this is perfectly fine.
>> For every node, 'repair --local' will repair the "primary" (where primary
>> means "the first range on the ring picked by the consistent hashing for
>> this node given its token", nothing more) range of the node in the ring.
>> And that range will be repaired for all replica in all data-centers. When
>> you assign tokens to multiple DCs, it's actually pretty common to offset the
>> tokens of one DC slightly compared to the other one. This will result in
>> the "primary" ranges always being small in one DC but not the other. But
>> please note that this is perfectly ok; it does not imply any imbalance between
>> data-centers. It also doesn't really mean that the nodes of one DC actually do
>> a lot more work than the other ones: all nodes most likely contribute
>> roughly the same amount of work to the repair. It only means that the nodes
>> of one DC "coordinate" more repair work than those of the other DC, which
>> is not really a big deal since coordinating a repair is cheap.
>> --
>> Sylvain
>> On Wed, Jun 25, 2014 at 4:43 PM, Paulo Ricardo Motta Gomes <> wrote:
>>> Hello,
>>> I'm running repair on a large CF with the "--local" flag in 2 different
>>> DCs. In one of the DCs the operation takes about 1 hour per node, while in
>>> the other it takes 10 hours per node.
>>> I would expect the times to differ, but not by so much. The writes on that
>>> CF all come from the DC where it takes 10 hours per node. Could this be the
>>> reason it takes so long in this DC?
>>> Additional info: C* 1.2.16, both DCs have the same replication factor.
>>> Cheers,
>>> --
>>> *Paulo Motta*
>>> Chaordic | *Platform*
>>> +55 48 3232.3200
> --
> *Paulo Motta*
> Chaordic | *Platform*
> +55 48 3232.3200
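
To make the "primary range" point from the thread concrete, here is a minimal Python sketch (not Cassandra code). It assumes an illustrative ring of size 2**127, four nodes per DC with evenly spaced tokens, and DC2's tokens offset by 100 from DC1's; the names `RING`, `primary_range_sizes`, and the DC labels are all hypothetical. It only computes the size of the range between each node's token and the previous token on the ring, which is the "primary" range as described above.

```python
# Illustrative ring size; the real value depends on the partitioner in use.
RING = 2**127


def primary_range_sizes(tokens_by_dc):
    """Return {dc: [size of each node's primary range]}.

    A node's primary range is taken here as (previous token on the whole
    ring, own token], no matter which data center owns the previous token.
    """
    ring = sorted((tok, dc) for dc, toks in tokens_by_dc.items() for tok in toks)
    sizes = {dc: [] for dc in tokens_by_dc}
    for i, (tok, dc) in enumerate(ring):
        prev_tok = ring[i - 1][0]  # i == 0 wraps around to the last token
        sizes[dc].append((tok - prev_tok) % RING)
    return sizes


nodes_per_dc = 4
step = RING // nodes_per_dc
offset = 100  # assumed small offset between the two DCs' tokens

tokens = {
    "DC1": [i * step for i in range(nodes_per_dc)],
    "DC2": [i * step + offset for i in range(nodes_per_dc)],
}

sizes = primary_range_sizes(tokens)
for dc, dc_sizes in sizes.items():
    # Print each primary range as a fraction of the whole ring.
    print(dc, [size / RING for size in dc_sizes])
```

Under these assumptions every DC2 node gets a tiny primary range (the offset) and every DC1 node gets almost a full quarter of the ring, which matches the description of one DC coordinating far more repair work than the other even though the data itself is balanced.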
