hadoop-common-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-14034) Allow ipc layer exceptions to selectively close connections
Date Fri, 03 Feb 2017 18:57:52 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-14034:
    Attachment: HADOOP-14034-branch-2.patch

The only difference between the two patches is context conflicts involving comments.

RpcServerExceptions used to be caught in processOneRpc, sent to the client unconditionally as
FATAL, rethrown, and caught in doRead, which then closed the connection.  This precludes any
non-fatal exceptions in the readers.

Basic changes:
* Existing reader-level exceptions are, and continue to be, fatal.  Renamed WrappedRpcServerException
to FatalRpcServerException - to better reflect what it is - and overrode getRpcStatusProto
to return FATAL.
* processOneRpc catches RpcServerExceptions and passes the exception’s RpcStatusProto, instead
of using hardcoded FATAL, to setupResponse.
* setupResponse will mark the connection “shouldClose” if RpcStatusProto is FATAL.
* readAndProcess and SASL’s unwrapPacketAndProcessRpcs will abort if shouldClose is set.
* processOneRpc does not rethrow handled RpcServerExceptions to prevent doRead from closing
the connection.
* doRead continues to close the connection if an exception is caught, which is now only internal
server errors or client read/write failures.
* doRead now also closes the connection if shouldClose is true, which, per the points above,
is set when setupResponse is called with a FATAL exception.
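
The control flow described in the points above can be sketched roughly as follows. This is
a minimal, simplified illustration and not the actual patch: every class name prefixed with
Sketch, the RpcStatus enum, the string requests, and the handle method are hypothetical
stand-ins for the real Server internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for RpcStatusProto.
enum RpcStatus { SUCCESS, ERROR, FATAL }

// Hypothetical stand-in for RpcServerException; assumed non-fatal by default.
class SketchRpcServerException extends Exception {
    SketchRpcServerException(String message) { super(message); }
    RpcStatus getRpcStatus() { return RpcStatus.ERROR; }
}

// Mirrors the renamed FatalRpcServerException: overrides the status to FATAL.
class SketchFatalRpcServerException extends SketchRpcServerException {
    SketchFatalRpcServerException(String message) { super(message); }
    @Override RpcStatus getRpcStatus() { return RpcStatus.FATAL; }
}

class SketchConnection {
    boolean shouldClose = false;
    boolean closed = false;
    final List<String> queued = new ArrayList<>();     // calls handed to the call queue
    final List<String> responses = new ArrayList<>();  // error responses sent to the client

    // Records the response and marks the connection for closing only on FATAL.
    void setupResponse(RpcStatus status, String message) {
        responses.add(status + ": " + message);
        if (status == RpcStatus.FATAL) {
            shouldClose = true;
        }
    }

    // Hypothetical request handling: two magic strings trigger the two exception kinds.
    void handle(String request) throws SketchRpcServerException {
        if (request.equals("overflow")) {
            throw new SketchRpcServerException("call queue full, retry later");
        }
        if (request.equals("bad-header")) {
            throw new SketchFatalRpcServerException("invalid RPC header");
        }
        queued.add(request);
    }

    // Passes the exception's own status to setupResponse and does not rethrow.
    void processOneRpc(String request) {
        try {
            handle(request);
        } catch (SketchRpcServerException e) {
            setupResponse(e.getRpcStatus(), e.getMessage());
        }
    }

    // Stops processing further requests once shouldClose is set.
    void readAndProcess(List<String> requests) {
        for (String request : requests) {
            if (shouldClose) {
                break;
            }
            processOneRpc(request);
        }
    }

    // Closes on any unexpected exception, or when a FATAL response was set up.
    void doRead(List<String> requests) {
        try {
            readAndProcess(requests);
        } catch (RuntimeException e) {
            closed = true;  // stands in for internal server errors or I/O failures
            return;
        }
        if (shouldClose) {
            closed = true;
        }
    }
}
```

In this sketch, a request that raises the non-fatal exception produces an ERROR response and
the connection keeps servicing later requests, while the fatal variant produces a FATAL
response, stops further processing, and causes doRead to close the connection.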

This paves the way for more intelligent call queue backoff policies - such as not disconnecting
the vast majority of “good clients” when overflow occurs.

> Allow ipc layer exceptions to selectively close connections
> -----------------------------------------------------------
>                 Key: HADOOP-14034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14034
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ipc
>    Affects Versions: 2.7.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-14034-branch-2.patch, HADOOP-14034-trunk.patch
> IPC layer exceptions generated in the readers are translated into fatal errors - resulting
> in connection closure.  Ex. RetriableExceptions from call queue pushback.
> Always closing the connection degrades performance for all clients since a disconnected
> client will immediately reconnect on retry.  Readers become overwhelmed servicing new connections
> and re-authentications from bad clients instead of servicing calls from good clients.  The
> call queues run dry.
> Exceptions originating in the readers should be able to indicate if the exception is
> an error or fatal so connections can remain open.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
