hadoop-yarn-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3481) Report NM aggregated container resource utilization in heartbeat
Date Fri, 17 Apr 2015 00:24:59 GMT

    [ https://issues.apache.org/jira/browse/YARN-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499037#comment-14499037 ]

Carlo Curino commented on YARN-3481:

Creating a new ResourceUtilization class would, I think, be particularly relevant if we start
tracking more resources than YARN enforces. That is, if YARN only enforces <RAM,CPU> and
we also want to monitor disk queues, disk bandwidth for writes/reads, disk IOPS, network bandwidth,
CPU IO-wait time, etc., then a new object is probably a good way to go.

If the set of resources we monitor and the set we enforce are the same, I would vote for evolving
Resource to express everything as doubles (RAM included). I ran into the limitations of Resource
in the context of the "reservation" work, where I was tracking cpu-over-time and running out
of Integer range (e.g., when counting memory over time for large reservations). Moving to doubles
would also let us simplify that code (removing local classes used only to handle integral resources).
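
To illustrate the Integer range problem described above, here is a minimal sketch. The class
and method names are hypothetical and are not taken from YARN or from the reservation code;
the 1 TB / 24 hour reservation is an assumed example size.

```java
// Hypothetical illustration only; not code from YARN or the reservation work.
public class ResourceOverTimeSketch {

    // Memory-over-time (MB * seconds) computed in int arithmetic.
    // Java int multiplication wraps silently when the result exceeds Integer.MAX_VALUE.
    static int memorySecondsAsInt(int memoryMb, int durationSeconds) {
        return memoryMb * durationSeconds;
    }

    // The same quantity computed as a double, which has far more range.
    static double memorySecondsAsDouble(int memoryMb, int durationSeconds) {
        return (double) memoryMb * durationSeconds;
    }

    public static void main(String[] args) {
        int memoryMb = 1024 * 1024;         // assumed reservation size: 1 TB expressed in MB
        int durationSeconds = 24 * 60 * 60; // assumed reservation length: one day

        // The true value is 90,596,966,400 MB-seconds, well above Integer.MAX_VALUE.
        System.out.println("int:    " + memorySecondsAsInt(memoryMb, durationSeconds));
        System.out.println("double: " + memorySecondsAsDouble(memoryMb, durationSeconds));
    }
}
```

The int version returns a wrapped value without raising any error, which is why integral
resources need extra helper code when they are accumulated over time.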

> Report NM aggregated container resource utilization in heartbeat
> ----------------------------------------------------------------
>                 Key: YARN-3481
>                 URL: https://issues.apache.org/jira/browse/YARN-3481
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Inigo Goiri
>            Priority: Minor
>   Original Estimate: 336h
>  Remaining Estimate: 336h
> To allow the RM to make better scheduling decisions, it should be aware of the actual utilization
> of the containers. The NM would aggregate the ContainerMetrics and report them in every heartbeat.
> Related to YARN-1012 but aggregated to reduce the heartbeat overhead.

This message was sent by Atlassian JIRA
