cassandra-user mailing list archives

From Jeff Jirsa <jji...@gmail.com>
Subject Re: Recover lost node from backup or evict/re-add?
Date Thu, 13 Jun 2019 13:16:11 GMT


> On Jun 13, 2019, at 2:52 AM, Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:
> 
>> On Wed, Jun 12, 2019 at 4:02 PM Jeff Jirsa <jjirsa@gmail.com> wrote:
> 
>> To avoid violating consistency guarantees, you have to repair the replicas while the lost node is down
> 
> How do you suggest to trigger it?  Potentially replicas of the primary range for the
> down node are all over the local DC, so I would go with triggering a full cluster repair with
> Cassandra Reaper.  But isn't it going to fail because of the down node?

I’m not sure there’s an easy and obvious path here - this is something TLP may want to enhance
Reaper to help with.

You have to specify the ranges with -st/-et, and you have to tell it to ignore the down host
by listing only the live replicas with -hosts. With vnodes you’re right that this may be lots
and lots of ranges all over the ring.
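
A minimal sketch of what that scripting could look like, assuming you have already worked out the (start, end) token pairs the down node was a replica for and the addresses of the live replicas. The function name, keyspace name, tokens, and addresses below are all hypothetical; only the -st, -et, and -hosts flags come from the discussion above. It only builds the command lines, it does not run them.

```python
import shlex
from typing import List, Sequence, Tuple


def repair_commands(
    keyspace: str,
    ranges: Sequence[Tuple[int, int]],
    live_hosts: Sequence[str],
) -> List[List[str]]:
    """Build one `nodetool repair` argv per token range.

    `ranges` and `live_hosts` are supplied by the caller (hypothetical
    inputs); the down node must simply be left out of `live_hosts`.
    """
    commands = []
    for start, end in ranges:
        argv = ["nodetool", "repair", "-st", str(start), "-et", str(end)]
        # Repeat -hosts once per live replica so the down host is skipped.
        for host in live_hosts:
            argv += ["-hosts", host]
        # "--" separates the options from the keyspace argument.
        argv += ["--", keyspace]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Made-up tokens and addresses, for illustration only.
    for argv in repair_commands("my_ks", [(-100, 0), (0, 100)], ["10.0.0.1", "10.0.0.2"]):
        print(shlex.join(argv))
```

With vnodes the list of ranges gets long, so generating the commands rather than typing them is probably the only practical way to do this by hand. As far as I know the host you run the repair from has to be one of the listed hosts.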

There’s a patch proposed (maybe committed in 4.0) that makes this a nonissue by allowing
bootstrap to stream one repaired set and all of the unrepaired replica data (which is probably
very small if you’re running IR regularly), which accomplishes the same thing.

> 
> It is also documented (I believe) that one should repair the node after it finishes the
> "replace address" procedure.  So should one repair before and after?

You do not need to repair after the bootstrap if you repair before. If the docs say that,
they’re wrong. The joining host gets writes during bootstrap and consistency levels are
altered during bootstrap to account for the joining host.
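
To make the last point concrete, here is a rough worked example of the arithmetic, assuming a single DC, QUORUM writes, and that each pending (joining) replica adds one to the number of acknowledgements a write waits for. The function names are made up for illustration and this is a simplification, not the actual server code.

```python
def quorum(replication_factor: int) -> int:
    """Majority of the natural replicas."""
    return replication_factor // 2 + 1


def write_acks_required(replication_factor: int, pending_replicas: int) -> int:
    """Acks a QUORUM write waits for while some replicas are still joining.

    Assumption: every pending replica raises the requirement by one, so a
    quorum of the old replica set and a quorum of the new one both hold.
    """
    return quorum(replication_factor) + pending_replicas


if __name__ == "__main__":
    print(write_acks_required(3, 0))  # normal operation, RF=3: 2
    print(write_acks_required(3, 1))  # one host bootstrapping, RF=3: 3
```

The joining host is already receiving those writes, so data written during the bootstrap does not need a separate repair afterwards.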