Date: Wed, 7 Jan 2015 05:28:35 +0000 (UTC)
From: "Ming Ma (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Updated] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load

[ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HADOOP-10597:
-----------------------------
    Attachment: HADOOP-10597-4.patch

Thanks, [~stevel@apache.org]! Here is the new patch with your suggestions.

Regarding the serialization of {{RetryAction}} via the {{RetriableException}} message string, I agree it is not necessarily the best approach. Here we need to serialize {{RetryAction}} and have the RPC server send it back to the RPC client.
Possible options that I know of:
* The current RPC header structure {{RpcHeaderProtos}} includes an exception message field, so it is convenient to use the {{RetriableException}} message.
* We can consider adding an optional {{RetryAction}} field to the RPC header {{RpcHeaderProtos}}. That requires more changes.

> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Steve Loughran
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597-3.patch, HADOOP-10597-4.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf, RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits the NN too hard, RPC requests will be in a blocking state, assuming OS connections don't run out. Alternatively, RPC or the NN can throw some well-defined exception back to the client based on certain policies when it is under heavy load; the client will understand such an exception and do exponential backoff, as another implementation of RetryInvocationHandler.
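
To make the first option concrete, here is a minimal, self-contained sketch of the general idea: the server packs a retry hint into an exception message string, and the client parses it and falls back to capped exponential backoff with jitter. It is not taken from any of the attached patches and does not use Hadoop classes. The class name {{BackoffHint}}, the {{RETRY_AFTER_MS=}} prefix, and all method names are hypothetical and chosen only for illustration; the wire format actually used by the patch may be quite different.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative only: hypothetical names, not Hadoop's RetryAction or RetriableException. */
final class BackoffHint {
    // Hypothetical marker the server would place at the start of the exception message.
    static final String PREFIX = "RETRY_AFTER_MS=";

    /** Server side: encode a suggested delay and a human-readable reason into one string. */
    static String encode(long delayMillis, String reason) {
        return PREFIX + delayMillis + ";" + reason;
    }

    /** Client side: return the suggested delay, or -1 if the message carries no valid hint. */
    static long decode(String message) {
        if (message == null || !message.startsWith(PREFIX)) {
            return -1;
        }
        int end = message.indexOf(';');
        if (end < 0) {
            return -1;
        }
        try {
            long delay = Long.parseLong(message.substring(PREFIX.length(), end));
            return delay >= 0 ? delay : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Upper bound for a given attempt: baseMillis doubled once per attempt, capped at maxMillis. */
    static long backoffCeiling(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Jump straight to the cap when doubling would exceed it (also avoids overflow).
            delay = delay > maxMillis / 2 ? maxMillis : delay * 2;
        }
        return Math.min(delay, maxMillis);
    }

    /** Pick a random delay in [0, ceiling] so that many clients do not retry in lockstep. */
    static long jitteredDelay(int attempt, long baseMillis, long maxMillis) {
        long ceiling = backoffCeiling(attempt, baseMillis, maxMillis);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    public static void main(String[] args) {
        String message = encode(250, "server is under heavy load");
        long hinted = decode(message);
        for (int attempt = 0; attempt < 5; attempt++) {
            // Prefer the server's hint when present; otherwise back off locally.
            long delay = hinted >= 0 ? hinted : jitteredDelay(attempt, 100, 5000);
            System.out.println("attempt " + attempt + " -> wait " + delay + " ms");
        }
    }
}
```

A string-encoded hint like this needs no protocol change, which is the convenience the first option describes, but it couples the client to a message format. A dedicated optional header field, as in the second option, would avoid the parsing at the cost of a larger change.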