zookeeper-dev mailing list archives

From Michael Han <h...@apache.org>
Subject Re: [jira] [Commented] (ZOOKEEPER-3556) Dynamic configuration file can not be updated automatically after some zookeeper servers of zk cluster are down
Date Thu, 26 Sep 2019 04:31:56 GMT
>> There was recently a post here from someone who has implemented this

Maybe this one?
http://zookeeper-user.578899.n2.nabble.com/About-ZooKeeper-Dynamic-Reconfiguration-td7584271.html

On Wed, Sep 25, 2019 at 9:19 PM Alexander Shraer <shralex@gmail.com> wrote:

> There was recently a post here from someone who has implemented this, but
> I couldn't find it for some reason.
>
> Essentially I think that you'd need to monitor the "health" and
> connectivity of servers to the leader, and issue reconfig commands to
> remove them when you suspect that they're down or add them back when you
> think they're up.
> Note that you always need at least a quorum of the ensemble to be up, so
> issuing a reconfig command (or any other command) after a quorum is lost
> won't work.
> You could use the information exposed in ZK's 4 letter commands to decide
> whether you think a server is up and connected to the quorum or down.
> Ideally we could also use the leader's view on who is connected,
> but it doesn't look like this is being exposed right now. You can also
> periodically issue test read/write operations on various servers to check
> if they're really operational.
>
> https://github.com/apache/zookeeper/blob/1ca627b5a3105d80ed4d851c6e9f1a1e2ac7d64a/zookeeper-docs/src/main/resources/markdown/zookeeperAdmin.md#sc_4lw
>
> As accurate failure detection is impossible in asynchronous systems, you'll
> need to decide how sensitive you are to potential failures vs false
> suspicions.
>
> Hope this helps...
>
> Alex
>
> On Wed, Sep 25, 2019 at 6:00 PM Gao,Wei <Wei.Gao@arcserve.com> wrote:
>
> > Hi Alexander Shraer,
> >  Could you please tell me how to implement automation on top?
> > Thank you very much!
> >
> > -----Original Message-----
> > From: Alexander Shraer (Jira) <jira@apache.org>
> > Sent: Thursday, September 26, 2019 1:27 AM
> > To: issues@zookeeper.apache.org
> > Subject: [jira] [Commented] (ZOOKEEPER-3556) Dynamic configuration file
> > can not be updated automatically after some zookeeper servers of zk
> > cluster are down
> >
> >
> >     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937925#comment-16937925 ]
> >
> > Alexander Shraer commented on ZOOKEEPER-3556:
> > ---------------------------------------------
> >
> > The described behavior is not a bug – currently reconfiguration requires
> > explicit action by an operator. One could implement automation on top. We
> > should consider this as a feature, since it sounds like several adopters
> > have implemented such automation. Perhaps one of them could contribute
> > this upstream.
> >
> > > Dynamic configuration file can not be updated automatically after some
> > > zookeeper servers of zk cluster are down
> > > ----------------------------------------------------------------------
> > >
> > >                 Key: ZOOKEEPER-3556
> > >                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3556
> > >             Project: ZooKeeper
> > >          Issue Type: Wish
> > >          Components: java client
> > >    Affects Versions: 3.5.5
> > >            Reporter: Steven Chan
> > >            Priority: Major
> > >   Original Estimate: 12h
> > >  Remaining Estimate: 12h
> > >
> > > I encountered a problem which blocks my development of load balancing
> > > using ZooKeeper 3.5.5.
> > > I have a ZooKeeper cluster which consists of five zk servers. The
> > > dynamic configuration file is as follows:
> > >
> > > server.1=zk1:2888:3888:participant;0.0.0.0:2181
> > > server.2=zk2:2888:3888:participant;0.0.0.0:2181
> > > server.3=zk3:2888:3888:participant;0.0.0.0:2181
> > > server.4=zk4:2888:3888:participant;0.0.0.0:2181
> > > server.5=zk5:2888:3888:participant;0.0.0.0:2181
> > >
> > > The zk cluster works fine if every member works normally. However, if,
> > > say, two of them suddenly go down without prior notice, the dynamic
> > > configuration file shown above is not updated automatically, which
> > > leads to the zk cluster failing to work normally.
> > > In my view, if server 1 and server 5 go down suddenly, the dynamic
> > > configuration file should be modified as follows:
> > >
> > > server.2=zk2:2888:3888:participant;0.0.0.0:2181
> > > server.3=zk3:2888:3888:participant;0.0.0.0:2181
> > > server.4=zk4:2888:3888:participant;0.0.0.0:2181
> > >
> > > But in this case, the dynamic configuration file never changes
> > > automatically unless you revise it manually.
> > > I think this is a very common case which may happen at any time. If
> > > so, how can we handle it?
> >
> >
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.3.4#803005)
> >
>

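The approach Alex outlines above (probe each server with a four-letter command, then remove suspected-down servers with reconfig while a quorum is still up) could be sketched roughly as follows in Python. This is an illustration under assumptions, not code from the thread or from ZooKeeper: all function names are hypothetical, the single-probe failure rule is a simplification, and the sketch assumes the `srvr` command is enabled on every server and that a `Mode:` line appears in its reply only when the server is serving requests.

```python
import socket


def four_letter_word(host, port, cmd="srvr", timeout=2.0):
    """Send a four-letter command; return the raw reply, or None on any socket error."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(cmd.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # server closes the connection after replying
                    break
                chunks.append(data)
        return b"".join(chunks).decode("utf-8", errors="replace")
    except OSError:  # refused, unreachable, DNS failure, timeout
        return None


def serving_mode(reply):
    """Return the value of the 'Mode:' line (e.g. 'leader'), or None if absent."""
    if not reply:
        return None
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def plan_removals(members, healthy):
    """Return sorted ids to remove, or [] unless a strict majority is healthy."""
    members = set(members)
    healthy = set(healthy) & members
    if len(healthy) * 2 <= len(members):
        return []  # quorum lost: a reconfig could not be committed anyway
    return sorted(members - healthy)


def reconfig_command(leaving):
    """Format a zkCli-style reconfig command, or None when nothing should leave."""
    if not leaving:
        return None
    return "reconfig -remove " + ",".join(str(sid) for sid in leaving)


def suggest_reconfig(servers, probe=four_letter_word):
    """servers maps server id -> (host, client_port); probe is injectable for testing."""
    healthy = [sid for sid, (host, port) in servers.items()
               if serving_mode(probe(host, port)) is not None]
    return reconfig_command(plan_removals(servers.keys(), healthy))


# Hypothetical usage with the five servers from the Jira example:
#   servers = {i: ("zk%d" % i, 2181) for i in range(1, 6)}
#   print(suggest_reconfig(servers))
```

The sketch only suggests a `reconfig -remove` command; running it requires reconfig to be enabled and authorized on the cluster. A real monitor would likely wait for several consecutive failed probes before suspecting a server, which is the sensitivity trade-off Alex mentions.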