hadoop-yarn-issues mailing list archives

From "Inigo Goiri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5215) Scheduling containers based on load in the servers
Date Wed, 08 Jun 2016 20:14:21 GMT

    [ https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321355#comment-15321355 ]

Inigo Goiri commented on YARN-5215:
-----------------------------------

This could be a subtask of YARN-1011. However, I thought it was isolated enough to stand on
its own. I think YARN-5202 also goes in the same overcommit direction. To clarify the
differences from YARN-1011: here we propose to estimate the utilization of external processes
(e.g., HDFS DataNode or Impala daemons) and schedule based on that, so I think it is orthogonal
to the overcommit work. Happy to move it to a subtask if you guys think it makes more sense
there.

Regarding the questions from [~curino]:
# This applies only at scheduling time. With the proposed approach we will allocate the same
or less than the current schedulers, so it's conservative in terms of scheduling; the only
issue is that it wouldn't use as many resources as it could. The estimation is:
{{externalUtilization = nodeUtilization - containersUtilization}}. Given that the container
and node utilization are captured at different intervals, we could have
containersUtilization > nodeUtilization; I think adding a check for negative values should be
enough. In any case, I don't see issues with enforcing the resources, as the only thing we do
is estimate the external utilization.
# This patch only prevents scheduling; further discussion in #4.
# As I mentioned in the first paragraph, I think this is orthogonal to overcommit. We can
run this without overcommitting resources and just prevent impact on the external load. If
overcommitting is enabled, we can still play this trick.
# For the functionality described in this patch, this is it; we can open other tasks to do
preemption at (1) scheduler level and (2) NM level. We could add the first one to this patch
if needed.
# We have had variations of this implemented and running in our cluster for the past year. In
our scenario, we have other latency-sensitive load running on those machines, and we want
to guarantee that it gets as many resources as it needs. Regarding unit testing, I can try to
play with the MiniYarnCluster to fake external load; it shouldn't be too bad to extend {{TestMiniYarnClusterNodeUtilization}}.
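
A minimal sketch of the estimation from #1, including the check for negative values. This is
not code from YARN-5215.000.patch: the class and method names are hypothetical, and it assumes
a single resource dimension (e.g., memory in MB) expressed as a long.

```java
// Hypothetical illustration only; none of these names come from the YARN code base.
final class ExternalLoadEstimator {
    private ExternalLoadEstimator() {}

    /**
     * externalUtilization = nodeUtilization - containersUtilization.
     * The two values are sampled at different intervals, so the difference
     * can come out negative; it is clamped to zero in that case.
     */
    static long estimateExternal(long nodeUtilization, long containersUtilization) {
        return Math.max(0L, nodeUtilization - containersUtilization);
    }

    /**
     * Resources the scheduler may hand out on a node once the estimated
     * external load is set aside. Never more than the node capacity, so the
     * result is the same as or less than what is allocated without the estimate.
     */
    static long schedulable(long capacity, long nodeUtilization, long containersUtilization) {
        long external = estimateExternal(nodeUtilization, containersUtilization);
        return Math.max(0L, capacity - external);
    }

    /** True if a new request still fits next to what is already allocated. */
    static boolean canAllocate(long request, long allocated, long capacity,
                               long nodeUtilization, long containersUtilization) {
        return allocated + request
            <= schedulable(capacity, nodeUtilization, containersUtilization);
    }
}
```

For example, with a capacity of 8192, a node utilization of 6000 and a containers utilization
of 4000, the estimated external load is 2000 and at most 6192 can be allocated on the node.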

(I tried to use bq to reply but it got messy, I hope this is comprehensive.)

Just to highlight my point on the major discussion, I think this can be a subtask of YARN-1011
but it's orthogonal to overcommit.

> Scheduling containers based on load in the servers
> --------------------------------------------------
>
>                 Key: YARN-5215
>                 URL: https://issues.apache.org/jira/browse/YARN-5215
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Inigo Goiri
>         Attachments: YARN-5215.000.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the resources.
> The proposal is to use the utilization information in the node and the containers to estimate
> how much is actually available in the NMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


