hadoop-common-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-9229) IPC: Retry on connection reset or socket timeout during SASL negotiation
Date Fri, 18 Jan 2013 16:02:14 GMT
Kihwal Lee created HADOOP-9229:

             Summary: IPC: Retry on connection reset or socket timeout during SASL negotiation
                 Key: HADOOP-9229
                 URL: https://issues.apache.org/jira/browse/HADOOP-9229
             Project: Hadoop Common
          Issue Type: Improvement
          Components: ipc
    Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.7
            Reporter: Kihwal Lee

When an RPC server is overloaded, incoming connections may not be accepted in time, causing
the listen queue to overflow. The impact on clients depends on the operating system. On
Linux, connections in this state look fully established to the clients, but they have no
buffers on the server side, so any data sent to the server is dropped.

This is not a problem for protocols where the client first waits for the server's greeting. Even
for client-speaks-first protocols, it is fine if the overload is transient and such connections
are accepted before the retransmissions of the dropped packets arrive. Otherwise, clients can hit
a socket timeout after several retransmissions. In certain situations, the connection is
reset while clients are still waiting for an ACK.

We have seen this happen to IPC clients during SASL negotiation. Since no call has been
sent at that point, we should allow a retry when a connection reset or socket timeout occurs
in this stage.
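
The proposed behavior could be sketched roughly as follows. This is only an illustration
and not the actual Hadoop IPC client code: the class SaslSetupRetry, the ConnectionSetup
interface, and the maxAttempts and sleepMillis parameters are all hypothetical names. It
assumes that a reset surfaces as a java.net.SocketException whose message contains
"Connection reset", which is common on JVMs but not guaranteed.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketException;
import java.net.SocketTimeoutException;

// Hypothetical helper, not part of Hadoop.
class SaslSetupRetry {

    /** Stand-in for "connect and run SASL negotiation"; no RPC call is sent here. */
    @FunctionalInterface
    interface ConnectionSetup {
        void run() throws IOException;
    }

    /** Treats socket timeouts and connection resets as retriable. */
    static boolean isRetriable(IOException e) {
        if (e instanceof SocketTimeoutException) {
            return true;
        }
        // Assumption: the JVM reports a TCP reset with this message text.
        return e instanceof SocketException
                && e.getMessage() != null
                && e.getMessage().toLowerCase().contains("connection reset");
    }

    /**
     * Runs the setup phase, retrying on retriable failures.
     * Returns the number of attempts used; rethrows the last failure otherwise.
     */
    static int setupWithRetry(ConnectionSetup setup, int maxAttempts, long sleepMillis)
            throws IOException {
        for (int attempt = 1; ; attempt++) {
            try {
                setup.run();
                return attempt;
            } catch (IOException e) {
                // Safe to retry only because no call has been sent yet.
                if (!isRetriable(e) || attempt >= maxAttempts) {
                    throw e;
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while waiting to retry");
            }
        }
    }
}
```

Other failures, such as an authentication error, are rethrown immediately in this sketch,
since retrying them would not help with a transient overload.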

