hadoop-common-issues mailing list archives

From "Xiaoyu Yao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15016) Add reservation support to RPC FairCallQueue
Date Fri, 03 Nov 2017 17:52:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238075#comment-16238075 ]

Xiaoyu Yao commented on HADOOP-15016:
-------------------------------------

[~ywskycn], thanks for the heads-up. This looks like an interesting proposal.

bq. However, this has some limitations when a cluster is shared among both end-users and some
service jobs, like some ETL jobs which run under a service account and need to issue lots
of RPC calls. 

Have you looked into RPC CallerID (HDFS-9184), which is designed to trace callers under different
services (Yarn/Spark/Hive/Tez)? You could extend an IdentityProvider to leverage that and thus
avoid punishing all the RPC calls from the same service user.
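
To make the suggestion concrete, here is a minimal, self-contained sketch of deriving a scheduling identity from the user plus the caller context, so that calls from one service account are not all counted against a single identity. The Call class and the makeIdentity method below are hypothetical stand-ins written for illustration. They are not Hadoop's actual IdentityProvider or Schedulable types, and the "user/context" format is an assumption.

```java
/** Illustration only: identity derived from user plus caller context. */
final class CallerContextIdentity {

    /** Hypothetical stand-in for an incoming RPC call (not a Hadoop class). */
    static final class Call {
        final String user;          // e.g. the service account name
        final String callerContext; // e.g. a job or query id; may be null

        Call(String user, String callerContext) {
            this.user = user;
            this.callerContext = callerContext;
        }
    }

    /**
     * Returns the identity used for per-caller accounting.
     * Falls back to the plain user name when no caller context is set,
     * which matches per-user accounting.
     */
    static String makeIdentity(Call call) {
        if (call.callerContext == null || call.callerContext.isEmpty()) {
            return call.user;
        }
        // Assumed format: "<user>/<callerContext>".
        return call.user + "/" + call.callerContext;
    }

    public static void main(String[] args) {
        System.out.println(makeIdentity(new Call("etl", "hive_query_1")));
        System.out.println(makeIdentity(new Call("etl", "spark_app_7")));
        System.out.println(makeIdentity(new Call("alice", null)));
    }
}
```

With this kind of identity, two jobs running under the same service account are tracked separately, so a heavy job does not lower the priority of a light one.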

bq. One idea here is to introduce reservation support to RPC resources.
Can you elaborate on how to quantify the cost of RPC calls, which are not equal in terms of
the cost on NN? The same RPC call with different parameters may differ significantly in
cost as well. Can you post more details of the proposal for discussion?
  

> Add reservation support to RPC FairCallQueue
> --------------------------------------------
>
>                 Key: HADOOP-15016
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15016
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>            Priority: Normal
>
> FairCallQueue was introduced to provide RPC resource fairness among different users. In
the current implementation, each user is weighted equally, and the processing priority of an
RPC call is based on how many requests that user has sent before. This works well when the cluster
is shared among several end-users.
> However, this has some limitations when a cluster is shared among both end-users and
some service jobs, like some ETL jobs which run under a service account and need to issue
lots of RPC calls. When the NameNode becomes quite busy, this set of jobs can easily be backed off
and deprioritized. We cannot simply treat this type of job as a "bad" user who randomly issues
too many calls, as their calls are normal calls. Also, it is unfair to weight an end-user and
a heavy service user equally when allocating RPC resources.
> One idea here is to introduce reservation support to RPC resources. That is, for some
services, we reserve some RPC resources for their calls. This idea is very similar to how
YARN manages CPU/memory resources among different resource queues. A few more details:
Along with the existing FairCallQueue setup (like using 4 queues with different priorities), we
would add some additional special queues, one for each special service user. For each special
service user, we provide a guaranteed RPC share (like 10%, which can be aligned with its YARN
resource share), and this percentage can be converted to a weight used in WeightedRoundRobinMultiplexer.
A quick example: we have 4 default queues with default weights (8, 4, 2, 1), and two special
service users (user1 with 10% share, and user2 with 15% share). The default weights sum to 15
and cover the remaining 75% share. So finally we'll have 6 queues,
4 default queues (with weights 8, 4, 2, 1) and 2 special queues (user1Queue weighted 15*10%/75%=2,
and user2Queue weighted 15*15%/75%=3).
> Incoming RPC calls from special service users will be put directly into the
corresponding reserved queue; other calls just follow the current implementation.
> By default, there is no special user and all RPC requests follow existing FairCallQueue
implementation.
> I would like to hear more comments on this approach, and would also like to know of any
better solutions. I will post a detailed design once I get some early comments.
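
The share-to-weight conversion in the description above can be sketched as follows. This is only an illustration of the arithmetic in the quoted example (weight = sum of default weights * reserved share / remaining default share). The class and method names are hypothetical, and rounding to the nearest integer with a minimum weight of 1 is an assumption the description does not specify.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: converts reserved RPC shares into multiplexer weights. */
final class ReservedQueueWeights {

    /**
     * @param defaultWeights weights of the default priority queues, e.g. {8, 4, 2, 1}
     * @param reservedShares reserved share per service user as a fraction, e.g. 0.10 for 10%
     * @return integer weight for each reserved queue, in insertion order
     */
    static Map<String, Integer> reservedWeights(int[] defaultWeights,
                                                Map<String, Double> reservedShares) {
        int defaultTotal = 0;
        for (int w : defaultWeights) {
            defaultTotal += w;
        }
        double reservedTotal = 0.0;
        for (double share : reservedShares.values()) {
            reservedTotal += share;
        }
        if (reservedTotal >= 1.0) {
            throw new IllegalArgumentException("reserved shares must sum to less than 100%");
        }
        // Share left over for the default queues (75% in the quoted example).
        double defaultShare = 1.0 - reservedTotal;

        Map<String, Integer> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : reservedShares.entrySet()) {
            // Assumption: round to nearest integer, never below 1.
            long w = Math.round(defaultTotal * e.getValue() / defaultShare);
            weights.put(e.getKey(), (int) Math.max(1L, w));
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Double> shares = new LinkedHashMap<>();
        shares.put("user1", 0.10);
        shares.put("user2", 0.15);
        // user1 -> 2 and user2 -> 3, matching the quoted example.
        System.out.println(reservedWeights(new int[] {8, 4, 2, 1}, shares));
    }
}
```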



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

