hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load
Date Sun, 21 Dec 2014 18:15:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255234#comment-14255234 ]

Steve Loughran commented on HADOOP-10597:
-----------------------------------------

-sorry, browser submitted too early.

Looks OK to me, though it's gone deep enough into the RPC stack that I'm out of my depth.

Minor recommendations
* tag things as audience=private as well as unstable
* {{LinearClientBackoffPolicy}} p 46-49: can we pull these inline strings out as public constants? It keeps errors down in tests & other code that sets them.
* {{NullClientBackoffPolicy}} should just extend {{Configured}} to remove the boilerplate set/get conf logic
* TestRPC.testClientBackOff(): recommend saving any caught IOException and, if !succeeded, rethrowing it. It'll help when debugging failing tests.
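
The last recommendation could look roughly like the sketch below. It is only an illustration of the save-and-rethrow pattern: {{RetryingCallCheck}}, {{Call}} and {{runUntilSuccess}} are hypothetical names, not code from the patch or from TestRPC, and the real test may need a different loop.

```java
import java.io.IOException;

/** Hypothetical helper; the names here are illustrative and not taken from the patch. */
final class RetryingCallCheck {

    /** A call that may fail with an IOException, e.g. an RPC the server rejects. */
    interface Call {
        void run() throws IOException;
    }

    /**
     * Runs the call up to maxAttempts times and returns the attempt number
     * that succeeded. The most recent IOException is remembered and rethrown
     * if no attempt succeeded, so the failure report carries the real cause
     * instead of a bare "succeeded was false".
     */
    static int runUntilSuccess(Call call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                call.run();
                return attempt;
            } catch (IOException e) {
                lastFailure = e; // keep it for the failure path below
            }
        }
        // Every attempt failed, so lastFailure is non-null here.
        throw lastFailure;
    }
}
```

With this shape the test fails with the original stack trace rather than a generic assertion message.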

One thing I will highlight is that I'm not that enamoured of how the retriable exception protobuf
data is being marshalled into the string value of the exception. Why choose this approach?


> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Steve Loughran
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597-3.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf, RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits the NN too hard, RPC requests will be in a blocking state, assuming
> OS connections don't run out. Alternatively, RPC or the NN can throw a well-defined exception
> back to the client, based on certain policies, when it is under heavy load; the client will understand
> such an exception and do exponential back off, as another implementation of RetryInvocationHandler.
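
For readers unfamiliar with the idea, the client side of the proposal can be sketched as follows. This is a minimal, self-contained illustration and not the Hadoop implementation: {{BackoffSketch}}, {{ServerBusyException}}, {{delayMillis}} and {{callWithBackoff}} are all hypothetical names, and it does not use RetryInvocationHandler. It assumes the server signals overload with a distinct exception type and that the client doubles its wait after each rejection, up to a cap.

```java
import java.io.IOException;

/** Hypothetical sketch of client-side exponential back off; not Hadoop code. */
final class BackoffSketch {

    /** Assumed marker exception meaning "server is overloaded, retry later". */
    static final class ServerBusyException extends IOException {
        ServerBusyException(String message) {
            super(message);
        }
    }

    /** A remote call that may be rejected by an overloaded server. */
    interface Call<T> {
        T run() throws IOException;
    }

    /** Delay before retry number {@code retry} (0-based): base * 2^retry, capped at maxMillis. */
    static long delayMillis(int retry, long baseMillis, long maxMillis) {
        if (baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("need 0 < baseMillis <= maxMillis");
        }
        long delay = baseMillis;
        for (int i = 0; i < retry; i++) {
            if (delay > maxMillis / 2) {
                return maxMillis; // doubling would exceed the cap (also avoids overflow)
            }
            delay *= 2;
        }
        return delay;
    }

    /** Retries only on ServerBusyException; any other IOException propagates immediately. */
    static <T> T callWithBackoff(Call<T> call, int maxRetries, long baseMillis, long maxMillis)
            throws IOException, InterruptedException {
        for (int retry = 0; ; retry++) {
            try {
                return call.run();
            } catch (ServerBusyException e) {
                if (retry >= maxRetries) {
                    throw e; // give up and surface the last rejection
                }
                Thread.sleep(delayMillis(retry, baseMillis, maxMillis));
            }
        }
    }
}
```

A production policy would usually add random jitter to the delay so that many rejected clients do not retry at the same instant; that is left out here to keep the sketch short.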



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
