hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8820) Enable RPC Congestion control by default
Date Thu, 30 Jul 2015 04:33:05 GMT

https://issues.apache.org/jira/browse/HDFS-8820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647162#comment-14647162

Ming Ma commented on HDFS-8820:

Thanks [~arpitagarwal]. Should we also enable this for communication between the DN and the NN?
It appears RetriableException is only handled by FailoverOnNetworkExceptionRetry, the retry
policy clients use in the NN HA scenario; the DN doesn't use that retry policy when it
communicates with the NN. In our clusters we configure a service port on the NN, so DN RPCs go
to the service RPC server, and backoff isn't enabled on that server. We could have the DN use a
retry policy that supports RetriableException, but that would require extra work.
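To make the gap concrete, here is a minimal sketch in plain Java with no Hadoop dependencies; the class names below are simplified stand-ins for the real Hadoop retry types, not the actual org.apache.hadoop.io.retry API. A policy in the spirit of FailoverOnNetworkExceptionRetry retries a server-side RetriableException (which is what RPC backoff throws), while a try-once policy, which is effectively what the DN gets today, fails immediately on the same exception.

```java
// Simplified stand-ins for Hadoop's retry machinery (hypothetical names,
// not the real org.apache.hadoop.io.retry API).
class RetriableException extends Exception {
    RetriableException(String msg) { super(msg); }
}

enum RetryDecision { FAIL, RETRY }

interface RetryPolicy {
    RetryDecision shouldRetry(Exception e, int retries);
}

// Behaves like FailoverOnNetworkExceptionRetry for this one case:
// a server-side RetriableException (e.g. RPC backoff) is retried
// until the retry budget is exhausted.
class RetriableAwarePolicy implements RetryPolicy {
    private final int maxRetries;
    RetriableAwarePolicy(int maxRetries) { this.maxRetries = maxRetries; }
    public RetryDecision shouldRetry(Exception e, int retries) {
        if (e instanceof RetriableException && retries < maxRetries) {
            return RetryDecision.RETRY;
        }
        return RetryDecision.FAIL;
    }
}

// Behaves like a try-once-then-fail policy, which is effectively what
// the DN uses for NN traffic: any exception, retriable or not, is fatal.
class TryOncePolicy implements RetryPolicy {
    public RetryDecision shouldRetry(Exception e, int retries) {
        return RetryDecision.FAIL;
    }
}

public class BackoffRetrySketch {
    public static void main(String[] args) {
        Exception backoff = new RetriableException("Server too busy");
        RetryPolicy client = new RetriableAwarePolicy(3);
        RetryPolicy datanode = new TryOncePolicy();
        // The HA client would retry the backoff signal; the DN would not.
        System.out.println("client:   " + client.shouldRetry(backoff, 0));
        System.out.println("datanode: " + datanode.shouldRetry(backoff, 0));
    }
}
```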

For the configuration part, I wonder if we should use a pattern similar to RPC's
{{setProtocolEngine}}, or to {{ipc.server.read.threadpool.size}}, where the NN or other services
can call {{RPC.Builder#setnumReaders}} to override the value. That way the NN doesn't need to
know the format of the configuration key name.
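A minimal sketch of the builder pattern being suggested, assuming simplified stand-in classes (the real {{RPC.Builder}} carries much more state, and the fallback logic here is hypothetical): an explicit setter call wins over the configuration key, so a caller like the NN never has to construct the key name itself.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a Hadoop Configuration: just a string-to-int map.
class Conf {
    private final Map<String, Integer> values = new HashMap<>();
    void setInt(String key, int value) { values.put(key, value); }
    int getInt(String key, int defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}

// Hypothetical builder in the spirit of RPC.Builder#setnumReaders:
// a programmatic setter overrides the configuration key, so callers
// never need to know the key's format.
class ServerBuilder {
    static final String READ_THREADPOOL_KEY = "ipc.server.read.threadpool.size";
    static final int READ_THREADPOOL_DEFAULT = 1;

    private final Conf conf;
    private Integer numReaders; // null = not set programmatically

    ServerBuilder(Conf conf) { this.conf = conf; }

    ServerBuilder setNumReaders(int n) {
        this.numReaders = n;
        return this;
    }

    int effectiveNumReaders() {
        if (numReaders != null) {
            return numReaders; // explicit override from the service
        }
        // Fall back to the configuration key, then the default.
        return conf.getInt(READ_THREADPOOL_KEY, READ_THREADPOOL_DEFAULT);
    }
}

public class BuilderOverrideSketch {
    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.setInt(ServerBuilder.READ_THREADPOOL_KEY, 4);

        // Without the setter, the configured value is used.
        System.out.println(new ServerBuilder(conf).effectiveNumReaders());

        // With the setter, the service's explicit choice wins.
        System.out.println(
            new ServerBuilder(conf).setNumReaders(8).effectiveNumReaders());
    }
}
```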

> Enable RPC Congestion control by default
> ----------------------------------------
>                 Key: HDFS-8820
>                 URL: https://issues.apache.org/jira/browse/HDFS-8820
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-8820.01.patch, HDFS-8820.02.patch
> We propose enabling RPC congestion control introduced by HADOOP-10597 by default.
> We enabled it on a couple of large clusters a few weeks ago and it has helped keep the
> namenodes responsive under load.
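For anyone wanting to try the feature described above before the default changes, HADOOP-10597 gated it behind a per-port configuration key; the sketch below assumes the NN's client RPC server listens on the stock port 8020, which is only an example.

```xml
<!-- core-site.xml: enable RPC backoff on one RPC server.
     Replace 8020 with the port the target RPC server listens on. -->
<property>
  <name>ipc.8020.backoff.enable</name>
  <value>true</value>
</property>
```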

This message was sent by Atlassian JIRA
