hadoop-yarn-issues mailing list archives

From "Nathan Roberts (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5356) ResourceUtilization should also include resource availability
Date Tue, 12 Jul 2016 13:47:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372905#comment-15372905 ]

Nathan Roberts commented on YARN-5356:

bq. Nathan Roberts, I understand that your problem is that with the current approach you know
that you have 6 cores available to the NM and 4 of them are used. However, the machine is
not that utilized (~30%). Correct? In that case, we would only need to report the actual size
of the machine at registration time as it would never change. Not sure that ResourceUtilization
would be the right place for that as it would be reported in every heartbeat continuously.

[~elgoiri], yep, that's exactly correct. I think reporting the physical capabilities of the
machine during registration should be OK. At least with Linux, it is technically possible for
the machine's resources to change (e.g. echo 0 > /sys/devices/system/cpu/cpu3/online, or memory
gets automatically removed because it's getting ECC errors, or something reserves a bunch of
memory for huge pages, or the NIC renegotiates from 10G to 1G), but I think these cases might be
unusual enough that we could ignore them. I originally suggested tweaking ResourceUtilization
because of this small chance of a physical resource changing, but I am happy to go either way.
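
To illustrate why the physical size of the machine matters, here is a minimal sketch of the arithmetic, assuming CPU usage is reported as a percentage of one core (so 400.0 means four cores' worth of work). The class UtilizationSketch and the method cpuFractionOfMachine are hypothetical names made up for this example and are not part of the YARN API; the core counts are illustrative only.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the YARN API. */
final class UtilizationSketch {
    private UtilizationSketch() {}

    /**
     * Converts an absolute CPU reading, expressed as a percentage of one core
     * (400.0 means four cores' worth of work), into a fraction of the whole
     * machine, given the physical core count reported at registration.
     */
    static double cpuFractionOfMachine(double cpuPercentOfOneCore, int physicalCores) {
        if (physicalCores <= 0) {
            throw new IllegalArgumentException("physicalCores must be positive");
        }
        if (cpuPercentOfOneCore < 0.0) {
            throw new IllegalArgumentException("cpuPercentOfOneCore must not be negative");
        }
        return cpuPercentOfOneCore / (100.0 * physicalCores);
    }

    public static void main(String[] args) {
        // The same absolute reading means very different things on different hosts.
        double reported = 400.0;
        for (int cores : new int[] {4, 32}) {
            System.out.println(String.format(Locale.ROOT,
                "%d physical cores: %.3f of the machine in use",
                cores, cpuFractionOfMachine(reported, cores)));
        }
    }
}
```

With only the absolute reading, the two hosts in main are indistinguishable; with the physical core count known, the first is full and the second is mostly idle.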

> ResourceUtilization should also include resource availability
> -------------------------------------------------------------
>                 Key: YARN-5356
>                 URL: https://issues.apache.org/jira/browse/YARN-5356
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Nathan Roberts
> Currently ResourceUtilization contains absolute quantities of resource used (e.g. 4096MB
> memory used). It would be good if it also included how much of that resource is actually available
> on the node so that the RM can use this data to schedule more effectively (overcommit, etc.).
> Currently the only available information is the Resource the node registered with (or
> later updated using updateNodeResource). However, this isn't really sufficient to get a
> good view of how utilized a resource is. For example, if a node reports 400% CPU utilization,
> does that mean it's completely full, or barely utilized? Today there is no reliable way to
> figure this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965, so I'm curious whether you have
> thoughts/opinions on this.
