zookeeper-user mailing list archives

From Chang Song <tru64...@me.com>
Subject Re: Serious problem processing hearbeat on login stampede
Date Thu, 14 Apr 2011 11:54:17 GMT

Patrick and Ted,
Unless the ZooKeeper client itself adds this feature, it is not easy for us to implement.

We only provide the platform for many services within our org.
Their batch servers will fire off whatever clients they want.
We have no control over it.

But an 8-second latency during a stampede is definitely a problem, and
it needs to be addressed in the server, not by a client back-off policy.

What happens when we have double the traffic?
Do we then have more than 20 seconds of latency?

I think we need to rethink how heartbeat traffic is handled relative to all
other requests and responses.

Thank you.

On Apr 14, 2011, at 2:24 PM, Ted Dunning wrote:

> This is a more powerful idea than it looks like at first glance.
> The reason is that there is often a highly non-linear and adverse impact to
> response time due to higher load.  I have never been able to properly
> account for this using queuing models in a system that is not swapping, but
> it is definitely real.
> If your rebooting processes simply wait between 0 and 5 seconds, your
> problems are likely to be much better.
> 2011/4/13 Patrick Hunt <phunt@apache.org>
>> 2) can you hold off some of the clients from the stampede? Perhaps add
>> a random holdoff to each of the clients before connecting,
>> additionally a similar random holdoff from closing the session. this
>> seems like a straightforward change from your client side (easy to
>> implement/try) but hard to tell given we don't have much insight into
>> what your use case is.
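
For reference, the random holdoff suggested in the quoted messages above could look roughly like the sketch below. It is only an illustration under stated assumptions: the names random_holdoff and connect_with_holdoff are hypothetical, the connect argument stands in for whatever call the client actually uses to open its ZooKeeper session, and the 5-second upper bound is taken from Ted's "between 0 and 5 seconds" remark.

```python
import random
import time


def random_holdoff(max_seconds=5.0, rng=random, sleep=time.sleep):
    """Sleep for a uniformly random delay in [0, max_seconds]; return the delay.

    Hypothetical helper: rng and sleep are injectable so the delay can be
    controlled in tests instead of actually blocking.
    """
    if max_seconds < 0:
        raise ValueError("max_seconds must be non-negative")
    delay = rng.uniform(0.0, max_seconds)
    sleep(delay)
    return delay


def connect_with_holdoff(connect, max_seconds=5.0, rng=random, sleep=time.sleep):
    """Wait a random holdoff, then call the supplied connect() callable.

    connect is a placeholder for the real session-opening call; it is not
    part of any ZooKeeper client API.
    """
    random_holdoff(max_seconds, rng=rng, sleep=sleep)
    return connect()
```

A batch job would call connect_with_holdoff(open_session) at startup, where open_session is the job's own function. This spreads connection attempts over the holdoff window on the client side; it does not change how the server prioritizes heartbeats.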
