zookeeper-user mailing list archives

From zk questions <zkquesti...@gmail.com>
Subject Re: Problem recovering from a bad reconfig (3.5)
Date Sat, 09 Nov 2013 20:11:22 GMT
I just realized that attachments don't go through, so here it is as a link instead:
https://docs.google.com/file/d/0B3K7QIlpXfSXdl92aHhxUXVDRkk/edit?usp=drive_web


On Sat, Nov 9, 2013 at 10:59 AM, zk questions <zkquestions@gmail.com> wrote:

> Hi,
>
> I've been testing the dynamic reconfig feature of 3.5 together with this
> patch (https://issues.apache.org/jira/browse/ZOOKEEPER-1691), and I'm
> having an issue where my zk cluster won't allow me to perform further
> reconfigs.
> So here's what I'm doing:
> 1) Start nodes 1 and 2
> 2) Invoke reconfig on 1 to add 2; this succeeds
> 3) Start node 3 with an initial configuration whose dynamic config lists
> just 2 and 3, where 2 isn't the leader (manually verified)
> 4) Invoke reconfig on 2 to add 3; this fails, with an error indicating
> that another reconfig is in progress
> 5) Then I restart 3 with the configuration containing just 1 and 3
> 6) Then I try again to add 3 to the cluster by invoking reconfig on 1 to
> add 3; and again I see an error indicating that another reconfig is in
> progress
>
> FWIW: I'm testing this scenario to simulate the situation where I'm
> automating the reconfig process and the dynamic configuration for 3 ends up
> containing a node that isn't the leader.
>
> I was wondering what I should do in this situation to recover from the
> failure at step 3 so that we can fix the dynamic config and then attempt a
> proper reconfig (steps 4 - 6)?
>
> I've also attached a tar containing a script to automatically reproduce
> the steps and problem I'm seeing above.
>
> Thanks.
>
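
The misconfiguration in step 3 could be caught before node 3 is started. The following is a minimal sketch, not part of the attached script, assuming (as the scenario above suggests) that the joiner's dynamic config should list both the joiner itself and the current leader. The hostnames, ports, and the helper names `server_ids` and `joiner_config_ok` are hypothetical; the `server.<id>=...` lines follow the 3.5 dynamic config style.

```python
import re

# Matches dynamic config lines such as
#   server.1=localhost:2888:3888:participant;2181
SERVER_LINE = re.compile(r"^server\.(\d+)=(.+)$")


def server_ids(dynamic_config):
    """Return the set of server ids listed in a dynamic config text."""
    ids = set()
    for line in dynamic_config.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            ids.add(int(match.group(1)))
    return ids


def joiner_config_ok(dynamic_config, joiner_id, leader_id):
    """Assumed rule: the joiner's config lists the joiner and the leader."""
    ids = server_ids(dynamic_config)
    return joiner_id in ids and leader_id in ids


# Step 3 above: node 3 lists only 2 and 3, while 1 is the leader.
bad_config = """\
server.2=localhost:2889:3889:participant;2182
server.3=localhost:2890:3890:participant;2183
"""

# Step 5 above: node 3 lists 1 and 3.
fixed_config = """\
server.1=localhost:2888:3888:participant;2181
server.3=localhost:2890:3890:participant;2183
"""

if __name__ == "__main__":
    print(joiner_config_ok(bad_config, joiner_id=3, leader_id=1))
    print(joiner_config_ok(fixed_config, joiner_id=3, leader_id=1))
```

A check like this only validates the file on disk; it does not clear whatever state makes the ensemble report that another reconfig is in progress.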
