hadoop-yarn-issues mailing list archives

From "Miklos Szegedi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7693) ContainersMonitor support configurable
Date Wed, 03 Jan 2018 21:02:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-7693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310276#comment-16310276 ]

Miklos Szegedi commented on YARN-7693:

Thank you for the reply [~yangjiandan].
+0 on the approach of adding a separate monitor class for this. I think it is useful to be
able to change the monitor.
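As a rough illustration of what a pluggable monitor could look like, here is a minimal sketch that picks the implementation from a configuration value by reflection. Every name in it (Monitor, DefaultMonitor, OpportunisticAwareMonitor, MonitorFactory and the key example.containers-monitor.class) is hypothetical and is not taken from the attached patches. Inside Hadoop the same pattern is normally written with Configuration.getClass and ReflectionUtils.newInstance; a plain Map is used here only to keep the sketch self-contained.

```java
import java.util.Map;

// Hypothetical stand-in for the ContainersMonitor interface; not the real Hadoop type.
interface Monitor {
    String name();
}

// Hypothetical default implementation, playing the role of ContainersMonitorImpl.
class DefaultMonitor implements Monitor {
    public String name() { return "default"; }
}

// Hypothetical alternative implementation that a cluster could opt into.
class OpportunisticAwareMonitor implements Monitor {
    public String name() { return "opportunistic-aware"; }
}

class MonitorFactory {
    // Hypothetical configuration key, invented for this sketch.
    static final String MONITOR_CLASS_KEY = "example.containers-monitor.class";

    // Instantiates the class named in the configuration, or the default if the key is unset.
    static Monitor create(Map<String, String> conf) {
        String className =
            conf.getOrDefault(MONITOR_CLASS_KEY, DefaultMonitor.class.getName());
        try {
            Class<? extends Monitor> clazz =
                Class.forName(className).asSubclass(Monitor.class);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot create monitor " + className, e);
        }
    }
}
```

With an empty configuration MonitorFactory.create returns the default monitor; setting the key to another class name swaps the implementation without touching the caller.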
In terms of the feature you described, I have some suggestions you may want to consider.
First of all, please consider using the JIRA sub-task feature and making this a sub-task.
How about doing this as part of YARN-1747 or even better YARN-1011?
You may want to leverage the option to simply turn off the current cgroups memory enforcement
using the configuration added in YARN-7064. It also handles monitoring resource utilization
using cgroups.
bq. 1) Separate containers into two different group Opportunistic_Group and Guaranteed_Group
under hadoop-yarn
The reason it is useful to have a single hadoop-yarn cgroup for all containers is that you
can apply the same logic and control the OOM killer for all of them. I would be happy to look at the
actual code, but adjusting two different cgroups may add too much complexity. It is especially
problematic in the case of promotion. When an opportunistic container is promoted to guaranteed,
you need to move it to the other cgroup, but this requires heavy lifting from the kernel that
takes significant time. See https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
for details.
bq. 2) Monitor system resource utilization and dynamically adjust resource of Opportunistic_Group
The concern here is that dynamic adjustment does not work in the current implementation
either, because it is too slow to respond in extreme cases. Please check out YARN-6677,
YARN-4599 and YARN-1014. The idea there is to disable the OOM killer on hadoop-yarn, as you
also suggested, so that we get notified by the kernel when system resources run
low. YARN can then decide which container to preempt, or adjust the soft limit, while the
containers are paused. The preemption unblocks the containers. Please let us know if you
have time and would like to contribute.
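To make the "YARN decides which container to preempt" step more concrete, here is a small sketch of one possible victim-selection rule. The ordering used (opportunistic containers before guaranteed ones, then the most recently launched first) and the names ContainerInfo and PreemptionPolicy are assumptions made for illustration only; they are not the behavior defined in the JIRAs mentioned above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified view of a running container.
class ContainerInfo {
    final String id;
    final boolean opportunistic;
    final long launchTimeMs;

    ContainerInfo(String id, boolean opportunistic, long launchTimeMs) {
        this.id = id;
        this.opportunistic = opportunistic;
        this.launchTimeMs = launchTimeMs;
    }
}

class PreemptionPolicy {
    // Assumed ordering: opportunistic first (false sorts before true),
    // then the newest container first, so the least work is lost.
    private static final Comparator<ContainerInfo> VICTIM_ORDER =
        Comparator.comparing((ContainerInfo c) -> !c.opportunistic)
            .thenComparing(
                Comparator.comparingLong((ContainerInfo c) -> c.launchTimeMs).reversed());

    // Returns the container to preempt, or empty if nothing is running.
    static Optional<ContainerInfo> pickVictim(List<ContainerInfo> running) {
        return running.stream().min(VICTIM_ORDER);
    }
}
```

A guaranteed container is only chosen when no opportunistic container is running, which matches the expectation that opportunistic containers never degrade guaranteed ones.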
bq. 3) Kill container only when adjust resource fail for given times
I absolutely agree with this. A sudden spike in CPU usage should not trigger immediate preemption.
In the case of memory, I am not sure how much you can adjust, though. My understanding is that the
basic design of opportunistic containers is that they never affect the performance of guaranteed
ones, but using IO for swapping would do exactly that. How would you reduce memory usage if
not by preempting?
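For point 3, the "kill only after the adjustment has failed a given number of times" idea could be sketched roughly as below. OverLimitTracker and its threshold are hypothetical names chosen for this example and do not come from the patch. The sketch only counts consecutive over-limit samples per container, so a single spike does not trigger preemption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks consecutive over-limit samples per container.
class OverLimitTracker {
    private final int maxConsecutive;
    private final Map<String, Integer> strikes = new HashMap<>();

    OverLimitTracker(int maxConsecutive) {
        if (maxConsecutive < 1) {
            throw new IllegalArgumentException("maxConsecutive must be >= 1");
        }
        this.maxConsecutive = maxConsecutive;
    }

    // Records one monitoring sample. Returns true only when the container has
    // been over its limit for maxConsecutive samples in a row.
    boolean record(String containerId, boolean overLimit) {
        if (!overLimit) {
            strikes.remove(containerId); // a good sample resets the count
            return false;
        }
        int count = strikes.merge(containerId, 1, Integer::sum);
        return count >= maxConsecutive;
    }
}
```

With a threshold of 3, two over-limit samples followed by a normal one reset the count, and the container is only flagged after three over-limit samples in a row.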

> ContainersMonitor support configurable
> --------------------------------------
>                 Key: YARN-7693
>                 URL: https://issues.apache.org/jira/browse/YARN-7693
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Jiandan Yang 
>            Assignee: Jiandan Yang 
>            Priority: Minor
>         Attachments: YARN-7693.001.patch, YARN-7693.002.patch
> Currently ContainersMonitor has only one default implementation, ContainersMonitorImpl.
> After the introduction of opportunistic containers, ContainersMonitor needs to monitor system
> metrics and even dynamically adjust Opportunistic and Guaranteed resources in the cgroup,
> so another ContainersMonitor may need to be implemented.
> Currently ContainerManagerImpl creates ContainersMonitorImpl directly with new,
> so ContainersMonitor needs to be configurable.

This message was sent by Atlassian JIRA

