Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@zookeeper.apache.org
From: Powell Molleti <pmolleti@vmware.com>
To: "user@zookeeper.apache.org" <user@zookeeper.apache.org>
Subject: Re: quorum connection manager shutdown takes long time
Thread-Topic: quorum connection manager shutdown takes long time
Thread-Index: AQHQ5DTWskNMyvTcokmFXnrOlg2zQA==
Date: Mon, 31 Aug 2015 21:34:38 +0000
Message-ID: <D20A167E.120C%pmolleti@vmware.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/alternative;
	boundary="_000_D20A167E120Cpmolletivmwarecom_"
MIME-Version: 1.0

--_000_D20A167E120Cpmolletivmwarecom_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

In reference to:
https://issues.apache.org/jira/browse/ZOOKEEPER-2246

Simply removing sock.setSoTimeout(0) from http://s.apache.org/TfI has the
unintended consequence of shutting down both the RecvWorker and SendWorker
threads in all cases. The current code seems designed to keep the socket
alive (and the threads running) so that the channel can be reused to
communicate again with a peer node that is still alive but needs to redo
leader election.

I could not reproduce any issue when the threads shut down after the
timeout, since new threads are created for the next iteration of leader
election. Still, I would rather reuse the threads and the channel, hence I
propose the following approach.

The alternative I suggest is to still remove setSoTimeout(0) from here:
http://s.apache.org/TfI , to also enable SO_KEEPALIVE via setKeepAlive() on
this socket, and to not consider it an error when a timeout occurs here:
http://bit.ly/1JHIdVY but to consider it an error when it happens here:
http://bit.ly/1NTjQ9R
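
A minimal sketch of that idea, assuming a plain blocking java.net.Socket.
The class, enum, and method names below are hypothetical and are not taken
from the ZooKeeper source; only setKeepAlive(), setSoTimeout() and
SocketTimeoutException are standard Java API. It relies on the documented
behaviour that a socket remains valid after a read times out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; none of these names exist in ZooKeeper.
class KeepAliveReadSketch {

    enum Outcome { DATA, IDLE, PEER_UNRESPONSIVE, CLOSED }

    // Keep a finite read timeout (instead of setSoTimeout(0)) and let the
    // kernel probe the peer via SO_KEEPALIVE.
    static void configure(Socket sock, int readTimeoutMs) throws IOException {
        sock.setKeepAlive(true);
        sock.setSoTimeout(readTimeoutMs);
    }

    // A timeout is only an error when a reply was expected in bounded time.
    static Outcome onTimeout(boolean responseExpected) {
        return responseExpected ? Outcome.PEER_UNRESPONSIVE : Outcome.IDLE;
    }

    // One blocking read attempt; the caller chooses how to react to the outcome.
    static Outcome readOnce(Socket sock, byte[] buf, boolean responseExpected)
            throws IOException {
        InputStream in = sock.getInputStream();
        try {
            int n = in.read(buf);
            return n < 0 ? Outcome.CLOSED : Outcome.DATA;
        } catch (SocketTimeoutException e) {
            // The socket is still usable after a read timeout.
            return onTimeout(responseExpected);
        }
    }
}
```

In this sketch a caller would keep looping on IDLE, and would close the
socket and stop its worker threads on PEER_UNRESPONSIVE or CLOSED.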

This means that users can tune the TCP keep-alive timeouts to speed up the
propagation of TCP socket failures to user space, and ZooKeeper also resets
the socket if it detects that the other side is not responding when it
knows it needs a response within some bounded time.

Ideally, I wish there were user-space pings on every socket channel between
ZooKeeper nodes to detect dead channels quickly. One seems to exist for the
sockets that do Follow/Lead after leader election is done, but not for this
one? Such a feature could be added, with care taken to keep it backward
compatible.

I posted the above text to Jira. Please point out any wrong assumptions I
have made, and provide comments and suggestions.

Thanks
Powell.


> From Raúl Gutiérrez Segalés <...@itevenworks.net>
> Subject Re: quorum connection manager shutdown takes long time
> Date Thu, 10 Jul 2014 18:02:37 GMT
> On 9 July 2014 08:28, Michi Mutsuzaki <michi@cs.stanford.edu> wrote:

>> I don't know how I missed that :) QA said this is reproducible, so
>> I'll try commenting this line out. Thanks Flavio!
>>

> I am curious, was it that?
> -rgs


--_000_D20A167E120Cpmolletivmwarecom_--