zookeeper-user mailing list archives

From Martin Kou <bitana...@gmail.com>
Subject Re: Does Leader Election Have a "Settling" period?
Date Thu, 10 May 2012 23:45:05 GMT
I've been using ZooKeeper 3.3.5, and the time needed to reconnect after the
leader dies is close to 4 seconds. I think a large part of that is the time
the server nodes need to confirm the death of their leader through missed
heartbeats.
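
As a rough illustration of why heartbeat-based detection scales with
tickTime, here is a minimal sketch that converts a limit expressed in ticks
into milliseconds. It assumes detection is bounded by syncLimit ticks, and the
syncLimit value of 2 is chosen purely for illustration; neither message in
this thread states the actual setting. The helper name tick_window_ms is
hypothetical.

```python
def tick_window_ms(tick_time_ms: int, ticks: int) -> int:
    """Convert a limit expressed in ticks (e.g. syncLimit) to milliseconds."""
    if tick_time_ms <= 0 or ticks <= 0:
        raise ValueError("tick_time_ms and ticks must be positive")
    return tick_time_ms * ticks


# Assumed for illustration only; the real syncLimit is not given in the thread.
ASSUMED_SYNC_LIMIT = 2

# Compare the two tickTime values mentioned in the quoted message.
for tick_time_ms in (2000, 100):
    window = tick_window_ms(tick_time_ms, ASSUMED_SYNC_LIMIT)
    print(f"tickTime={tick_time_ms} ms -> window of {window} ms")
```

Under these assumptions, any wait that is counted in ticks shrinks by the same
factor as tickTime, which would be consistent with the delay disappearing at
tickTime=100ms.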

Best Regards,
Martin Kou

On Thu, May 10, 2012 at 1:35 PM, Mark Gius <mgius7096@gmail.com> wrote:

> I'm doing some testing around a Client being connected to a zookeeper
> endpoint that goes away and I'm seeing what appears to be a "settling"
> period that is causing some errors.
>
> The test is as follows:
>
>  1) Three zookeeper servers are started up on the same host, configured to
> cluster with each other.
>  2) A Client is created and attaches to Server 1 (using
> deterministic_conn_order flag to force this)
>  3) Shut down Server 1 (which is NOT the Leader)
>  4) Servers 2 and 3 still have quorum.  Interruption of service should be
> minimal.
>  5) The Client _should_ reconnect immediately to Server 2 or 3.
>
> The behavior I am seeing in practice is that after shutting down Server 1
> quorum is lost and the Client takes on the order of 15-20 seconds to
> re-establish a connection to the cluster.  I do not see this behavior on a
> cluster that has existed for some time (say, 30-60 seconds).  I also do not
> see this problem on a cluster whose tickTime has been decreased to 100ms
> from the default of 2000ms.
>
> Is there a settling period immediately after a Leader is elected, during
> which a change in quorum membership triggers a full leader election that
> might not otherwise be necessary?  If so, where can I find information
> about how this settling period behaves?
>
> I have uploaded the logs for each of the three zookeeper servers here:
> https://gist.github.com/2655709
>
> Mark
>
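
For anyone trying to reproduce the three-servers-on-one-host setup described
in the quoted test, here is a minimal sketch that generates a zoo.cfg per
server. The port numbers, the /tmp/zk-test base directory, the initLimit and
syncLimit values, and the helper name build_zoo_cfg are all assumptions for
illustration, not taken from Mark's configuration.

```python
def build_zoo_cfg(server_id: int, n_servers: int = 3, tick_time_ms: int = 2000,
                  init_limit: int = 10, sync_limit: int = 5,
                  base_dir: str = "/tmp/zk-test") -> str:
    """Return zoo.cfg text for one member of a single-host test ensemble."""
    if not 1 <= server_id <= n_servers:
        raise ValueError("server_id must be between 1 and n_servers")
    lines = [
        f"tickTime={tick_time_ms}",
        f"initLimit={init_limit}",   # assumed value
        f"syncLimit={sync_limit}",   # assumed value
        # Each server needs its own data directory and client port,
        # because all of them share one host.
        f"dataDir={base_dir}/server{server_id}",
        f"clientPort={2180 + server_id}",
    ]
    # Every member lists the whole ensemble; the quorum and election
    # ports must also be distinct per server on a shared host.
    for sid in range(1, n_servers + 1):
        lines.append(f"server.{sid}=localhost:{2887 + sid}:{3887 + sid}")
    return "\n".join(lines) + "\n"


print(build_zoo_cfg(1))
```

Each dataDir also needs a myid file containing that server's id (1, 2, or 3).
Passing tick_time_ms=100 gives the reduced-tickTime variant mentioned above.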
