zookeeper-user mailing list archives

From Alexander Shraer <shra...@gmail.com>
Subject Re: dynamic config file number
Date Mon, 18 Jun 2018 23:02:48 GMT
The way it was implemented, the version (which is printed in your
log, like version=1f001cc8d5) is not stored in the
dynamic config file; it is part of the file name. It corresponds
to the zxid at which the configuration was committed.
You should never change it manually or copy it from a different cluster.
Instead, you should start either with a static config file,
which will then be automatically converted to a dynamic one, or with an
un-numbered dynamic one, as you suggest.
https://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperReconfig.html#sc_reconfig_file
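
Because the version lives only in the file name, it can be read back without opening the file. The following is a minimal Python sketch, assuming the suffix after ".dynamic." is the hexadecimal version, as in the directory listing quoted below. The helper name dynamic_config_version is hypothetical and not part of ZooKeeper.

```python
import re
from typing import Optional

# Hypothetical helper, not a ZooKeeper API. Assumes a versioned dynamic
# config file is named "<anything>.dynamic.<hex version>".
_VERSIONED = re.compile(r".*\.dynamic\.([0-9a-fA-F]+)")


def dynamic_config_version(filename: str) -> Optional[int]:
    """Return the config version encoded in the file name, or None
    if the name carries no hexadecimal version suffix."""
    match = _VERSIONED.fullmatch(filename)
    return int(match.group(1), 16) if match else None


print(dynamic_config_version("zoo.cfg.dynamic.1f001cc8d5"))  # 133145872597
print(dynamic_config_version("zoo.cfg.dynamic"))             # None
print(dynamic_config_version("zoo.cfg.dynamic.next"))        # None
```

The first value equals the "Current version" reported in the quoted error log, which is consistent with the server taking its config version from the file name.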

I don't remember exactly, but I'm guessing that when a server boots, it
uses the version in the file name to bootstrap its config info.
Then, when you reconfig, the zxid of the reconfig (which is also the
version of the new config) is lower than the config version your cluster
already has (probably because the new cluster has committed fewer ops than
the previous one, so its zxid is smaller). The reconfig therefore fails with
an error that the config is stale (it has a lower zxid / version than the
one the server already has).
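
The two numbers in the quoted error log can be checked against this explanation. The following is a minimal Python sketch, assuming the usual zxid layout (epoch in the high 32 bits, counter in the low 32 bits). The names split_zxid and is_stale are made up for illustration, and the comparison is a simplified stand-in for the server's own check, not its actual code.

```python
from typing import Tuple


def split_zxid(zxid: int) -> Tuple[int, int]:
    """Split a 64-bit zxid into (epoch, counter).
    Assumes epoch = high 32 bits, counter = low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def is_stale(proposed_version: int, current_version: int) -> bool:
    """Simplified illustration: a config whose version is lower than
    the one the server already holds is treated as stale."""
    return proposed_version < current_version


current = int("1f001cc8d5", 16)  # version taken from the copied file name
proposed = 4294967306            # "stale config" value from the error log

print(current)                      # 133145872597
print(split_zxid(current))          # (31, 1886421)
print(split_zxid(proposed))         # (1, 10)
print(is_stale(proposed, current))  # True
```

Under that assumption, the copied file name carries a version from epoch 31 of the old cluster, while the reconfig on the fresh cluster was committed in epoch 1, so the new config compares as older than the one the server booted with.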


Alex

On Mon, Jun 18, 2018 at 8:04 AM, oo4load <c.turksema@gmail.com> wrote:

> I had a problem getting dynamic reconfig to work on new / clean clusters
> if I copied the zoo.cfg and zoo.cfg.dynamic.(number) files over from an
> older installation.
>
>
> Here's what happens:
>
> [zk: localhost:2181(CONNECTED) 2] config
> server.1=srv5703h:2888:3888:participant;0.0.0.0:2181
> server.2=srv5703k:2888:3888:participant;0.0.0.0:2181
> server.3=srv5704y:2888:3888:participant;0.0.0.0:2181
> version=1f001cc8d5
>
> [zk: localhost:2181(CONNECTED) 3] reconfig -remove 3
> Committed new configuration:
> server.1=srv5703h:2888:3888:participant;0.0.0.0:2181
> server.2=srv5703k:2888:3888:participant;0.0.0.0:2181
> server.3=srv5704y:2888:3888:participant;0.0.0.0:2181
> version=1f001cc8d5
>
>
> As you can see, the config version doesn't change.
> If you check the filesystem, a ".next" file is created on each ZooKeeper
> server with the new config, but it seems like it's never committed.
>
> -rw-r-----. 1 prof prof 282 Jun 18 12:39 zoo.cfg
> -rw-r-----. 1 prof prof 159 Jun 18 15:25 zoo.cfg.dynamic.1f001cc8d5
> -rw-r-----. 1 prof prof 123 Jun 18 15:26 zoo.cfg.dynamic.next
>
>
> On the Zookeepers where the reconfig command was NOT run, the logs show the
> following message:
> 2018-06-18 15:26:56,491 [myid:3] - INFO  [ProcessThread(sid:3
> cport:-1)::PrepRequestProcessor@476] - Incremental reconfig
> 2018-06-18 15:26:56,493 [myid:3] - ERROR [ProcessThread(sid:3
> cport:-1)::QuorumPeer@1460] - setLastSeenQuorumVerifier called with stale
> config 4294967306. Current version: 133145872597
>
>
> After growing a ton of grey hairs we figured out that a new cluster must
> start with an "unnumbered" dynamic config file, and copying over an
> existing config always fails. Can anyone explain why that is?
>
> Thanks,
>
> Chris
>
