zookeeper-user mailing list archives

From Flavio Junqueira <...@yahoo-inc.com>
Subject Re: Zookeeper WAN Configuration
Date Tue, 28 Jul 2009 01:01:15 GMT

Todd, some more answers. Please read the information at the bottom of  
this message carefully.

On Jul 27, 2009, at 4:02 PM, Todd Greenwood wrote:

> I'm assuming that you're setting the weight of ZooKeeper servers in
> PODs to zero, which means that their votes when ordering updates do
> not count.
> [Todd] Correct.
> If my assumption is correct, then you should see a significant
> improvement in read performance. I would say that write performance
> wouldn't be very different from clients in PODs opening a direct
> connection to DC.
> [Todd] So the Leader, knowing that machine(s) have a voting weight  
> of zero, doesn't have to wait for their responses in order to form a  
> quorum vote? Does the leader even send voting requests to the weight  
> zero followers?

In the current implementation, the leader does send voting requests to  
zero-weight followers. Once we have observers implemented, it won't.

>> 3. ZK Servers within the POD would be resilient to network
>> connectivity failure between the POD and the DC. Once connectivity
>> re-established, the ZK Servers in the POD would sync with the ZK
>> servers in the DC, and, from the perspective of a client within the
>> POD, everything just worked, and there was no network failure.
> We want to have servers switching to read-only mode upon network
> partitions, but this is a feature under development. We don't have
> plans for implementing any model of eventual consistency that would
> allow updates even when not being able to form a quorum, and I
> personally believe that it would be a major change, with major
> implications not only to the code base, but also to the semantics of
> our API.
> [Todd] What is the current (3.2) behaviour in the case of a network  
> failure that prevents connectivity between ZK Servers in a pod?  
> Assuming the pod is composed of weight=0 followers...are the clients  
> connected to these zookeeper servers still able to read? do they get  
> exceptions on write? do the clients hang if it's a synchronous call?

The clients won't be able to read, because we don't yet have the  
feature of going read-only upon partitions.

>> 4. A WAN topology of co-located ZK servers in both the DC and (n)
>> PODs would not significantly degrade the performance of the
>> ensemble, provided large blobs of traffic were not being sent across
>> the network.
> If the zk servers in the PODs are assigned weight zero, then I don't
> see a reason for having lower performance in the scenario you
> describe. If weights are greater than zero for zk servers in PODs,
> then your performance might be affected, but there are ways of
> assigning weights that do not require receiving votes from all co-
> locations for progress.
> [Todd] Great, we'll proceed with hierarchical configuration w/ ZK  
> Servers in pods having a voting weight of zero. Could you provide a  
> pointer to a configuration that shows this? The docs are a bit lean  
> in this regard...

We should have a twiki page on this. For now, you can find an example  
in the header of QuorumHierarchical.java.
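
To illustrate the idea behind zero-weight servers, here is a minimal Python sketch of a hierarchical quorum check. It is not the code in QuorumHierarchical.java; the function name, the group layout (one DC group, two POD groups) and the server ids are made up for the example. It assumes that a group counts only if its total weight is non-zero, that a group is won when more than half of its weight has acked, and that a quorum needs a majority of the counting groups.

```python
def contains_quorum(acks, groups, weights):
    """Return True if the server ids in `acks` form a hierarchical quorum.

    groups:  dict mapping a group id to the list of server ids in it
    weights: dict mapping a server id to its voting weight
    (All names here are hypothetical, chosen for illustration.)
    """
    # Only groups whose total weight is non-zero take part in voting.
    voting = {}
    for gid, members in groups.items():
        total = sum(weights[s] for s in members)
        if total > 0:
            voting[gid] = total

    # A group is won when strictly more than half of its weight has acked.
    won = 0
    for gid, total in voting.items():
        acked = sum(weights[s] for s in groups[gid] if s in acks)
        if 2 * acked > total:
            won += 1

    # A quorum is a strict majority of the voting groups.
    return 2 * won > len(voting)


# Assumed layout: servers 1-3 in the DC (weight 1), 4-7 in two PODs (weight 0).
groups = {"dc": [1, 2, 3], "pod_a": [4, 5], "pod_b": [6, 7]}
weights = {1: 1, 2: 1, 3: 1, 4: 0, 5: 0, 6: 0, 7: 0}

dc_majority = contains_quorum({1, 2}, groups, weights)
pods_plus_one = contains_quorum({1, 4, 5, 6, 7}, groups, weights)
```

With this layout, two DC servers are enough for progress, while acks from every POD server plus a single DC server are not, which is why the PODs should not slow down writes.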

Also, I found a couple of bugs recently that may or may not affect  
your setup, so I suggest that you apply the patches in ZOOKEEPER-481  
and ZOOKEEPER-479. We would like to have these patches in for the next  
release (3.2.1), which should be out in two or three weeks, if there  
is no further complication.

I also realized that one thing won't work in your case, although the  
fix should be relatively easy: there is currently no guarantee that a  
zero-weight follower will not be elected leader, because we don't check  
the weight during leader election. I'll open a jira and put up a patch  
soon.
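
As a rough picture of what such a check could look like, here is a small Python sketch that drops zero-weight servers from the candidate set before choosing a leader. This is only an illustration, not the patch and not the election code itself. The function name and data shapes are hypothetical, and it assumes candidates are ranked by last zxid with ties broken by server id, and that a server with no configured weight defaults to weight 1.

```python
def pick_leader(last_zxid, weights):
    """Pick a leader among servers with non-zero weight.

    last_zxid: dict mapping a server id to the last zxid it has seen
    weights:   dict mapping a server id to its voting weight
    (Hypothetical helper for illustration only.)
    """
    # Exclude zero-weight servers; a missing weight is assumed to mean 1.
    eligible = {sid: zxid for sid, zxid in last_zxid.items()
                if weights.get(sid, 1) > 0}
    if not eligible:
        return None
    # Prefer the highest zxid, then the highest server id.
    return max(eligible, key=lambda sid: (eligible[sid], sid))


# Server 4 is a zero-weight POD server with the most recent zxid.
chosen = pick_leader({1: 10, 2: 12, 4: 15}, {1: 1, 2: 1, 4: 0})
```

Without the weight filter, server 4 would win in this example because it has the highest zxid. With the filter, server 2 is chosen.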

