hadoop-common-issues mailing list archives

From "Chris Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load
Date Mon, 13 Oct 2014 18:21:35 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169689#comment-14169689 ]

Chris Li commented on HADOOP-10597:
-----------------------------------

Hi [~mingma], thanks for adding some numbers. If I understand the graph correctly, the
latency spike is a result of maxing out the call queue's capacity, which FairCallQueue will
not solve, since FCQ has no choice but to enqueue a call somewhere. Just to double check, were
all these calls made by the same user? I'd guess that RPC client backoff would work just
as well with FairCallQueue disabled, since it solves a different problem: alleviating
a full queue. I do agree with Steve that we'll want some fuzz on the retry policy, since a
purely linear schedule could cause the load to become periodic over time.
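
To make the "fuzz" point concrete, here is a minimal sketch of a randomized retry delay. It is not taken from the attached patches: the class `JitteredBackoff`, its method names, and the choice of exponential growth with a uniformly random delay below a capped ceiling are all assumptions for illustration, and `baseMillis` and `maxMillis` are assumed to be positive.

```java
import java.util.Random;

/** Hypothetical helper, not part of Hadoop: exponential backoff with random jitter. */
public final class JitteredBackoff {
    private JitteredBackoff() {}

    /**
     * Upper bound on the delay before retry number `attempt` (0-based):
     * baseMillis doubled `attempt` times, capped at maxMillis.
     */
    public static long ceilingMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt; i++) {
            // Stop before doubling would exceed the cap (also avoids overflow).
            if (delay > maxMillis / 2) {
                return maxMillis;
            }
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /**
     * Picks a delay uniformly in [0, ceiling], so clients that were rejected
     * at the same moment do not all retry at the same moment.
     */
    public static long delayMillis(int attempt, long baseMillis, long maxMillis, Random rng) {
        long ceiling = ceilingMillis(attempt, baseMillis, maxMillis);
        return (long) (rng.nextDouble() * (ceiling + 1));
    }
}
```

With a fixed schedule, every client rejected in the same burst comes back in the same later burst; drawing each delay at random spreads those retries across the interval instead.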


> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf,
RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits the NN too hard, RPC requests will sit in a blocking state, assuming
OS connections don't run out. Alternatively, RPC or the NN can throw a well-defined exception
back to the client, based on certain policies, when it is under heavy load; the client will understand
such an exception and do exponential backoff, as another implementation of RetryInvocationHandler.
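
The idea described above can be sketched with a toy server and client. This is only an illustration under stated assumptions, not Hadoop's RPC code: `OverloadBackoffSketch`, `ServerOverloadedException`, `Server`, and `callWithBackoff` are hypothetical names, and "queue is full" stands in for whatever overload policy the server would actually apply.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: reject calls when the queue is full, back off on the client. */
public final class OverloadBackoffSketch {
    private OverloadBackoffSketch() {}

    /** Hypothetical stand-in for the "well-defined exception" sent to the client. */
    public static final class ServerOverloadedException extends Exception {
        public ServerOverloadedException(String message) {
            super(message);
        }
    }

    /** Toy server that rejects a call instead of blocking when its queue is full. */
    public static final class Server {
        private final BlockingQueue<String> callQueue;

        public Server(int capacity) {
            this.callQueue = new ArrayBlockingQueue<>(capacity);
        }

        public void submit(String call) throws ServerOverloadedException {
            // offer() returns false immediately on a full queue; put() would block.
            if (!callQueue.offer(call)) {
                throw new ServerOverloadedException("call queue full");
            }
        }

        /** Simulates a handler taking one call off the queue; null if empty. */
        public String handleOne() {
            return callQueue.poll();
        }
    }

    /** Retries with a doubling delay; returns the number of attempts used. */
    public static int callWithBackoff(Server server, String call,
                                      int maxAttempts, long baseMillis)
            throws ServerOverloadedException, InterruptedException {
        long delay = baseMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                server.submit(call);
                return attempt;
            } catch (ServerOverloadedException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the overload to the caller
                }
                Thread.sleep(delay);
                delay *= 2; // exponential growth between retries
            }
        }
    }
}
```

The design point is that the server fails fast rather than holding the connection open, which moves the waiting from the server's queue to the client's retry loop.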



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
