zookeeper-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [zookeeper] lvfangmin opened a new pull request #843: ZOOKEEPER-3296: Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum
Date Thu, 07 Mar 2019 18:39:54 GMT
lvfangmin opened a new pull request #843: ZOOKEEPER-3296: Explicitly closing the sslsocket
when it failed handshake to prevent issue where peers cannot join quorum
URL: https://github.com/apache/zookeeper/pull/843
 
 
   The quorum connection manager handles connections sequentially, with a default listen
backlog queue size of 50. During a network loss, socket reads time out after
syncLimit * tickTime, and almost all of the subsequent connect requests in the backlog queue
time out on the other side before they are processed.
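
   To make the scale of the stall concrete, here is a rough back-of-the-envelope sketch. It assumes the worst case, in which every queued request is already dead and the server waits the full read timeout on each one. The class and method names are hypothetical, and the syncLimit and tickTime values are assumed examples, not values taken from this report.

```java
public final class BacklogDrainEstimate {
    // Read timeout for a single stalled connection, in milliseconds.
    public static long readTimeoutMs(int syncLimit, int tickTimeMs) {
        return (long) syncLimit * tickTimeMs;
    }

    // Worst case: every queued request is dead and each one costs a full timeout,
    // because connections are handled one at a time.
    public static long worstCaseDrainMs(int backlog, int syncLimit, int tickTimeMs) {
        return backlog * readTimeoutMs(syncLimit, tickTimeMs);
    }

    public static void main(String[] args) {
        int backlog = 50;      // default listen backlog named in the description
        int syncLimit = 5;     // assumed example value
        int tickTimeMs = 2000; // assumed example value
        // prints "500000 ms"
        System.out.println(worstCaseDrainMs(backlog, syncLimit, tickTimeMs) + " ms");
    }
}
```

   With these assumed values a full backlog takes over eight minutes to drain, far longer than the 10-second window in which a newly queued peer gives up.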
   
   Those timed-out learners then try to connect to a different server, leaving their connect
requests on the server side without sending a close_notify packet. The server slowly consumes
this queue, waiting out the full syncLimit * tickTime timeout for each request that has not
sent a close_notify packet. New connect requests are queued whenever a spot opens in the
listen backlog queue, but they time out before the server handles them, so no new connection
ever completes successfully and the peer fails to join the quorum.
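
   The idea named in the title, explicitly closing the SSL socket when the handshake fails, can be sketched roughly as follows. This is a minimal illustration and not the patch in this pull request: the HandshakeGuard class and its method names are hypothetical, and only standard javax.net.ssl calls are used.

```java
import java.io.IOException;
import java.net.Socket;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public final class HandshakeGuard {
    // Runs the TLS handshake. On failure the socket is closed right away so the
    // other side sees the connection end instead of waiting for its own timeout.
    public static boolean handshakeOrClose(SSLSocket socket) {
        try {
            socket.startHandshake();
            return true;
        } catch (IOException e) {
            closeQuietly(socket);
            return false;
        }
    }

    // Closes the socket and ignores any secondary error raised while closing.
    static void closeQuietly(Socket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close itself fails.
        }
    }

    public static void main(String[] args) throws IOException {
        // An unconnected socket stands in for a peer whose handshake cannot complete.
        SSLSocket socket = (SSLSocket) SSLSocketFactory.getDefault().createSocket();
        boolean ok = handshakeOrClose(socket);
        System.out.println("handshake ok=" + ok + ", closed=" + socket.isClosed());
    }
}
```

   Closing on failure keeps a dead request from occupying the sequential handler for a full read timeout, which is the behavior the description above identifies as the cause of the stall.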
   
   Please check the Jira for more details.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
