zookeeper-user mailing list archives

From James Mulcahy <james_mulc...@apple.com>
Subject Re: Unbalanced client connections
Date Fri, 04 Jul 2014 13:36:26 GMT
On 4 Jul 2014, at 14:16, Flavio Junqueira <fpjunqueira@yahoo.com.INVALID> wrote:

> Ok, so a couple of obvious checks

Sure…

> - Are you passing a connection string with all five servers?

Yes, most definitely.  Prior to deploying this I did some extensive testing where I killed
off ZK servers at random to test our clients’ ability to reconnect to another server in
the cluster.  I know that if they absolutely need to, they can connect elsewhere, but the
graphs show they almost always pick the same server.

> - Are you calling zoo_deterministic_conn_order(1) by any chance (you shouldn't if you want shuffling)?

No, I wasn’t aware of that function, but in mentioning it you’ve led me to the code
that does the shuffling.  Is there anything on the server side that forces a client to move elsewhere
if the server has a disproportionate share of the clients connected to it?  That’s the feature
I thought I had read about.  That said, given a sufficiently random random() function, it
looks like the permute should do enough to stop all clients arriving on the same server initially
anyway.

Perhaps I’ll need to add some instrumentation to dump out the permuted connection list and
see how it varies across the clients.
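
A rough sketch of what that instrumentation could check, written as a standalone simulation
rather than against the real client code.  Everything here is an assumption for illustration:
the helper names, the host strings, and the use of standard srand()/rand() in place of random()
to keep it portable.  It only shows how the first entry of a shuffled list depends on the seed
each client ends up with.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_HOSTS 16

/* Hypothetical helper: Fisher-Yates shuffle of a host list, seeded explicitly
 * so that the effect of the seed on the resulting order is visible. */
void shuffle_hosts(const char **hosts, size_t n, unsigned int seed)
{
    srand(seed);
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i;      /* index in [0, i) */
        const char *tmp = hosts[i - 1];
        hosts[i - 1] = hosts[j];
        hosts[j] = tmp;
    }
}

/* Simulate n_clients clients, each shuffling its own copy of the host list
 * with its own seed, and count how often each host ends up first. */
void count_first_choices(const char *const *hosts, size_t n,
                         const unsigned int *seeds, size_t n_clients,
                         size_t *counts)
{
    assert(n > 0 && n <= MAX_HOSTS);
    for (size_t i = 0; i < n; i++)
        counts[i] = 0;

    for (size_t c = 0; c < n_clients; c++) {
        const char *order[MAX_HOSTS];
        for (size_t i = 0; i < n; i++)
            order[i] = hosts[i];
        shuffle_hosts(order, n, seeds[c]);

        /* Pointer comparison is enough: order[] holds the same string
         * objects as hosts[], only rearranged. */
        for (size_t i = 0; i < n; i++) {
            if (order[0] == hosts[i]) {
                counts[i]++;
                break;
            }
        }
    }
}

/* Dump the per-host first-choice counts, one host per line. */
void print_first_choices(const char *const *hosts, const size_t *counts,
                         size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%s %zu\n", hosts[i], counts[i]);
}
```

If every simulated client is handed the same seed, every one of them lands on the same first
host, which is the pattern in my graphs.  With distinct seeds the counts should spread out, so
logging the seed (or the permuted list itself) on real clients seems like the thing to compare.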

—James

> -Flavio
> 
> 
> On Friday, July 4, 2014 2:01 PM, James Mulcahy <james_mulcahy@apple.com> wrote:
> 
> 
>> 
>> 
>> 
>> Hi Flavio,
>> 
>> Thanks for the quick response, and apologies for not including these details up front!
>> 
>> - C client binding
>> - 99.99% Mac OS X clients (10.9.2), with a couple of Linux clients (Ubuntu 14.04)
>> - All ZK nodes are Linux (Ubuntu 14.04)
>> - ZooKeeper 3.4.6
>> 
>> No Windows involved here….
>> 
>> —James
>> 
>> 
>> On 4 Jul 2014, at 13:57, Flavio Junqueira <fpjunqueira@yahoo.com.INVALID> wrote:
>> 
>>> Hi James,
>>> 
>>> Are you using the C or the Java client binding? What's the OS? I'm asking because there is an issue with the randomization of the connect string on Windows we found, but I haven't created a jira for it yet.
>>> 
>>> -Flavio 
>>> 
>>> 
>>> On Friday, July 4, 2014 10:41 AM, James Mulcahy <james_mulcahy@apple.com> wrote:
>>> 
>>> 
>>>> 
>>>> 
>>>> 
>>>> Hello,
>>>> 
>>>> I run a 5 node ZooKeeper ensemble, with ~900 clients connected at a given time.  I’m noticing that at any one point in time, all the clients are generally connected to the same ZooKeeper node.
>>>> 
>>>> Looking back over the graphs I have which track this, there has only been one brief period where one node didn’t have >90% of the clients; and during that period, two nodes shared roughly 50% of the clients each.
>>>> 
>>>> Is this expected behaviour?  Is there anything I can do to tune this, to encourage the clients to be more balanced?
>>>> 
>>>> My expectation was that the clients would self-balance. I thought I’d read that somewhere in the documentation, but I can’t find a reference for that now.
>>>> 
>>>> Thanks in advance,
>>>> 
>>>> —James
>>>> 
>> 
>> 

