zookeeper-user mailing list archives

From <Alexandar.Gvozdeno...@ubs.com>
Subject Possible issue with cluster availability following new Leader Election - ZK 3.4
Date Wed, 09 May 2012 13:27:07 GMT

Hi Zookeeper devs and users, 


I've been doing some load and failover testing on the ZK 3.4 branch
using a moderately large data set (700 MB and 20k nodes), and I think
there could be an issue.

When I bring down the leader of a 3-node cluster, it takes around 20-30
seconds for the cluster as a whole to become available again.
This is because, once a new leader is elected, it pushes out a snapshot
to all the peers, who in turn persist it locally before sending an ack
back. Only then does the leader decide it has a valid quorum. In this
case pretty much all of the time is taken up sending the data over the
network and re-saving it.
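
As a rough sanity check on those numbers, the arithmetic below converts the
observed outage into an effective transfer-plus-persist rate. It is only a
sketch: it assumes the whole 20-30 seconds is spent moving and re-saving the
700 MB snapshot, and the function name is made up for illustration.

```python
def effective_throughput_mb_per_s(data_mb: float, outage_s: float) -> float:
    """Effective rate if the whole outage is spent shipping and saving the snapshot."""
    return data_mb / outage_s


if __name__ == "__main__":
    # 700 MB data set, 20-30 s outage, both figures taken from the test above.
    for seconds in (20, 30):
        rate = effective_throughput_mb_per_s(700, seconds)
        print(f"{seconds} s outage -> {rate:.1f} MB/s")
```

On these assumptions the rate comes out at roughly 23-35 MB/s, which is
plausible for low-spec VMs and suggests the outage scales with data set size.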

Granted, I'm testing this on some low-spec VMs, so I wouldn't expect a
real-world sync for a data set that size to take anything like as long.
However, is this not a significant constraint on availability if,
whenever a leader fails, a full snapshot needs to be sent to and
persisted by a quorum of peers before the cluster as a whole can be
deemed available?

I notice that when a peer joins a stable cluster as a follower,
synchronization is implemented via diffs, and the peer is quickly
available for client connections provided it already had an up-to-date
local state.
Should something similar not be possible when a new leader is elected?
A quick glance at the code (line 390 of LearnerHandler) suggests there
is some logic to send an empty diff, but I never see this triggered.
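
To make the question concrete, here is a minimal sketch of the kind of
decision being described. It is not the actual LearnerHandler code: the
function, its parameters and the in-memory list of committed zxids are all
hypothetical, and only the general idea of choosing between a diff and a
full snapshot is taken from the discussion above.

```python
from enum import Enum
from typing import Sequence


class SyncMode(Enum):
    DIFF = "DIFF"    # send only the missing transactions (possibly none)
    TRUNC = "TRUNC"  # peer is ahead of the leader and must roll back
    SNAP = "SNAP"    # send the full snapshot


def choose_sync_mode(peer_last_zxid: int,
                     leader_last_zxid: int,
                     committed_zxids: Sequence[int]) -> SyncMode:
    """Hypothetical sync decision for a peer connecting to a leader.

    committed_zxids is an assumed in-memory log of recent commits,
    ordered oldest to newest; it may be empty.
    """
    # Peer already matches the leader: an empty diff is enough.
    if peer_last_zxid == leader_last_zxid:
        return SyncMode.DIFF
    if committed_zxids:
        oldest, newest = committed_zxids[0], committed_zxids[-1]
        # Peer is behind, but the gap is covered by the in-memory log.
        if oldest <= peer_last_zxid <= newest:
            return SyncMode.DIFF
        # Peer has transactions the leader never committed.
        if peer_last_zxid > newest:
            return SyncMode.TRUNC
    # No usable history: fall back to a full snapshot.
    return SyncMode.SNAP
```

Under this sketch, a peer whose last zxid equals the leader's would get an
empty diff even when the in-memory log is empty, which is the behaviour I
would have expected to see in a cluster with no writes in flight.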

I am not mutating any state in the cluster whilst I am bringing stuff
up and down, so is this behaviour a bug or by design?

I saw a related question
(http://zookeeper-user.578899.n2.nabble.com/leader-election-length-td7086868.html#a7089472)
a few months back that touched on this, but there was not much
follow-up.

Many thanks

Alex




