hadoop-common-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load
Date Fri, 06 Mar 2015 00:42:40 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349693#comment-14349693 ]

Arpit Agarwal commented on HADOOP-10597:
----------------------------------------

bq. We have been using "no server retry policy" option in production clusters for some time
and things are fine.
That's a good data point, thanks. A couple of questions:
# What kind of back-off policy are you using on the client? Do you have a separate Jira for
the client-side work? I just realized today that we already have configurable retry policies
via {{DFS_CLIENT_RETRY_POLICY_SPEC_KEY}}. We also have an {{ExponentialBackoffRetry}}, but it
doesn't look widely used. Did you use either of these?
# Did you use the same trigger on the server in production (RPC queue being full)?
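
For readers unfamiliar with the trigger named in the second question, the following is a minimal sketch of "reject when the RPC queue is full", assuming a bounded in-memory call queue. The names {{BoundedCallQueue}} and {{ServerTooBusyException}} are hypothetical and chosen for illustration; they are not Hadoop's actual RPC server classes, and the exception is unchecked only to keep the sketch short.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a bounded call queue that rejects instead of blocking.
final class BoundedCallQueue {
    /** Hypothetical signal that tells the client to back off and retry later. */
    static final class ServerTooBusyException extends RuntimeException {
        ServerTooBusyException(String message) {
            super(message);
        }
    }

    private final BlockingQueue<Runnable> calls;

    BoundedCallQueue(int capacity) {
        this.calls = new ArrayBlockingQueue<>(capacity);
    }

    /** Enqueues without blocking; a full queue is treated as the back-off trigger. */
    void submit(Runnable call) {
        // offer() returns false immediately when the queue is at capacity.
        if (!calls.offer(call)) {
            throw new ServerTooBusyException("call queue is full, retry later");
        }
    }

    /** Handler threads would drain the queue; returns null when it is empty. */
    Runnable poll() {
        return calls.poll();
    }

    int size() {
        return calls.size();
    }
}
```

The design choice shown here is that the reader thread never blocks on a full queue; the overload is surfaced to the caller as an exception it can react to.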

> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597-3.patch, HADOOP-10597-4.patch,
HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf, RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits the NN too hard, RPC requests will be in a blocking state, assuming
OS connections don't run out. Alternatively, RPC or the NN can throw a well-defined exception
back to the client, based on certain policies, when it is under heavy load; the client will understand
such an exception and do exponential back-off, as another implementation of RetryInvocationHandler.
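
The client-side behaviour described in the issue can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patch attached to this issue and not Hadoop's RetryInvocationHandler: the class {{BackoffRetry}}, the exception {{ServerBusyException}}, and the parameters {{maxRetries}}, {{baseDelayMs}} and {{maxDelayMs}} are all hypothetical.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a client that backs off exponentially on a "server busy" signal.
final class BackoffRetry {
    /** Hypothetical well-defined exception the server throws when under heavy load. */
    static final class ServerBusyException extends RuntimeException {
        ServerBusyException(String message) {
            super(message);
        }
    }

    /** Delay before retry number attempt (0-based): baseDelayMs * 2^attempt, capped at maxDelayMs. */
    static long delayMs(int attempt, long baseDelayMs, long maxDelayMs) {
        long delay = baseDelayMs;
        // Stop doubling once the cap is reached so the value cannot grow without bound.
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    /** Invokes rpc, sleeping with exponentially growing delays while the server reports it is busy. */
    static <T> T call(Supplier<T> rpc, int maxRetries, long baseDelayMs, long maxDelayMs) {
        for (int attempt = 0; ; attempt++) {
            try {
                return rpc.get();
            } catch (ServerBusyException e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted, surface the overload to the caller
                }
                try {
                    Thread.sleep(delayMs(attempt, baseDelayMs, maxDelayMs));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
            }
        }
    }
}
```

With {{baseDelayMs}} of 100 and {{maxDelayMs}} of 1000, the successive delays in this sketch are 100, 200, 400, 800 and then 1000 ms for every later retry.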



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
