zookeeper-user mailing list archives

From Rakesh R <rake...@huawei.com>
Subject RE: adding a separate thread to detect network timeouts faster
Date Fri, 13 Sep 2013 06:24:14 GMT
>>>>>> I think this can be done purely on the client side. Create a separate
thread that sends a 4 letter word command like ruok periodically, and close the socket if
the client doesn't get the response within a certain amount of time.


Thanks, Michi, for pointing out the '4 letter word command'.

I would like to add one point for the case where we have a large number of clients (as mentioned in
the mail threads below), say 50,000 clients with a heartbeat interval of 2 secs. With this ruok
approach, there would be the overhead of establishing socket connections if each client
sends the ruok command to its respective server. Instead of sending a heartbeat from each zkclient
session, the clientcnxn side logic could send one heartbeat per host and update the
status for all the clients that were created from that host. Any thoughts?
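
To put a number on it: 50,000 clients each probing every 2 secs is 25,000 new
connections per second across the ensemble. A rough sketch of the per-host idea
follows, in Python for brevity. HostHeartbeat, probe and the listener callbacks are
hypothetical names, not part of the ZooKeeper client, and the sketch assumes every
session on the host talks to the same server (otherwise it would need one prober per
server).

```python
import threading


class HostHeartbeat:
    """One prober per client host; every local session subscribes to its result.

    Hypothetical helper, not a ZooKeeper API. `probe` is any callable that
    returns True when the server answered in time (for example a ruok check).
    """

    def __init__(self, probe, interval):
        self.probe = probe
        self.interval = interval
        self._listeners = []
        self._lock = threading.Lock()
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def register(self, listener):
        # Each client session on this host registers one callback.
        with self._lock:
            self._listeners.append(listener)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop_event.set()

    def _run(self):
        # wait() returns False on timeout, so the loop body runs once per interval
        # until stop() is called.
        while not self._stop_event.wait(self.interval):
            alive = self.probe()  # a single probe for the whole host
            with self._lock:
                listeners = list(self._listeners)
            for listener in listeners:
                listener(alive)  # fan the result out to every session
```

With this shape the probe traffic grows with the number of hosts rather than the
number of sessions. The cost is that a session only learns about a failure through
the shared prober, so a problem specific to one session's own connection would go
unnoticed.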

-Rakesh

-----Original Message-----
From: mutsuzaki@gmail.com [mailto:mutsuzaki@gmail.com] On Behalf Of Michi Mutsuzaki
Sent: 12 September 2013 23:35
To: Rakesh R
Cc: user@zookeeper.apache.org; German Blanco
Subject: Re: adding a separate thread to detect network timeouts faster

On Thu, Sep 12, 2013 at 12:05 AM, Rakesh R <rakeshr@huawei.com> wrote:
> AFAIK, ping requests would not involve any disk I/O, but they would go through the RequestProcessor
> chain and execute sequentially.

Yes, that's what I meant. Ping requests don't touch disk, but they do go through the commit
processor. So if a ping request is behind a write operation that takes a long time, the ping
request will be affected. This is done intentionally to take the disk into account for the
heartbeat mechanism.

Anyways, I misunderstood what Jeremy was proposing. He wants to keep the session timeout relatively
high to tolerate slow disk, but at the same time detect non-disk failure (node down, network
partition) more quickly.

I think this can be done purely on the client side. Create a separate thread that sends a
4 letter word command like ruok periodically, and close the socket if the client doesn't get
the response within a certain amount of time.
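
A minimal sketch of such a watchdog thread, in Python for brevity. The server answers
ruok with imok when it is running; everything else here is an assumption: the names
ruok and RuokWatchdog are made up for illustration, and on_failure stands in for
whatever the application does to close or reset its ZooKeeper connection.

```python
import socket
import threading


def ruok(host, port, timeout):
    """Send 'ruok' on a fresh connection; True only if 'imok' comes back.

    `timeout` applies to the connect and to each read, not to the whole exchange.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(b"ruok")
            reply = b""
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:  # server closed the connection early
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:  # refused, unreachable, or timed out
        return False


class RuokWatchdog(threading.Thread):
    """Probe every `interval` seconds; call `on_failure` once when a probe fails."""

    def __init__(self, host, port, interval, timeout, on_failure):
        super().__init__(daemon=True)
        self.host, self.port = host, port
        self.interval, self.timeout = interval, timeout
        self.on_failure = on_failure  # hypothetical hook, e.g. close the client
        self._stop_event = threading.Event()

    def run(self):
        # wait() returns False on timeout, so this probes once per interval
        # until stop() is called.
        while not self._stop_event.wait(self.interval):
            if not ruok(self.host, self.port, self.timeout):
                self.on_failure()
                return

    def stop(self):
        self._stop_event.set()
```

Usage would look like RuokWatchdog("zk1.example.com", 2181, 2.0, 1.0, on_failure).start(),
where the host name is a placeholder. Reacting to a single failed probe is the
simplest policy; requiring a few consecutive failures would reduce false alarms at
the cost of slower detection.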
