hadoop-yarn-issues mailing list archives

From "Jan Lukavsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4681) ProcfsBasedProcessTree should not calculate private clean pages
Date Wed, 10 Feb 2016 08:55:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140515#comment-15140515 ]

Jan Lukavsky commented on YARN-4681:

[~cnauroth], I tested this patch against our jobs. It helps somewhat, but it doesn't solve
the whole problem. Another problem is that we see spikes of direct memory allocations (so
far I haven't tracked down where exactly they come from). This led me to think that it might
help not to use the exact instantaneous memory consumption of a container, but to average it
over some time period (configurable, with a default of zero, which would preserve the current
behavior).
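To make the averaging idea concrete, here is a minimal sketch of a sliding time window over memory samples. The class name {{WindowedMemoryAverage}}, its constructor parameter and the {{record}} method are hypothetical and are not part of YARN or of the attached patch; how the result would be wired into the NodeManager's container monitoring is left open. A window of zero returns the instantaneous reading, which corresponds to the current behavior.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not YARN code): averages memory samples over a time window. */
final class WindowedMemoryAverage {
    private final long windowMillis;                           // 0 disables averaging
    private final Deque<long[]> samples = new ArrayDeque<>();  // {timestampMillis, bytes}
    private long sum;                                          // sum of bytes in the window

    WindowedMemoryAverage(long windowMillis) {
        if (windowMillis < 0) {
            throw new IllegalArgumentException("window must be >= 0");
        }
        this.windowMillis = windowMillis;
    }

    /** Records a sample and returns the value to compare against the container limit. */
    long record(long nowMillis, long bytes) {
        if (windowMillis == 0) {
            return bytes; // default: use the instantaneous reading, as today
        }
        samples.addLast(new long[] {nowMillis, bytes});
        sum += bytes;
        // Evict samples that are at least one full window older than the newest one.
        // The newest sample is never evicted, so the deque is never empty here.
        while (nowMillis - samples.peekFirst()[0] >= windowMillis) {
            sum -= samples.removeFirst()[1];
        }
        return sum / samples.size();
    }
}
```

With a window longer than a spike, a short burst of direct memory allocation raises the averaged value only partially, so the container would be killed only if the high usage persists.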

So, first I will modify the patch as you suggest (so that if the Locked field is missing,
then the behavior of the ProcfsBasedProcessTree will be exactly the same as before). I will
then try to add the time averaging and let you know if it helped.

Regarding the more aggressive strategies, I made some experiments and I don't think it would

> ProcfsBasedProcessTree should not calculate private clean pages
> ---------------------------------------------------------------
>                 Key: YARN-4681
>                 URL: https://issues.apache.org/jira/browse/YARN-4681
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.6.0, 2.7.0
>            Reporter: Jan Lukavsky
>         Attachments: YARN-4681.patch
> ProcfsBasedProcessTree in the NodeManager calculates the memory used by a process tree by parsing
> {{/proc/<pid>/smaps}}, where it calculates {{min(Pss, Shared_Dirty) + Private_Dirty +
> Private_Clean}}. Because private clean pages that are not {{mlock}}ed can be reclaimed by the
> kernel, this should be changed to count only {{Locked}} pages instead.
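
The two calculations in the description can be compared with a small sketch. This is an illustration only, not the actual ProcfsBasedProcessTree code: the class {{SmapsSketch}} and its methods are hypothetical, and the fallback to the existing formula when a mapping has no {{Locked}} field is an assumption modeled on the behavior discussed in the comment above. Values are in kB, as reported by smaps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not YARN code) of the two smaps-based calculations. */
final class SmapsSketch {
    /** Splits smaps text into one field map (values in kB) per memory mapping. */
    static List<Map<String, Long>> parse(String smaps) {
        List<Map<String, Long>> mappings = new ArrayList<>();
        Map<String, Long> current = null;
        for (String line : smaps.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts[0].isEmpty()) {
                continue; // blank line
            }
            if (!parts[0].endsWith(":")) {
                // A line such as "00400000-0040b000 r-xp ..." starts a new mapping.
                current = new HashMap<>();
                mappings.add(current);
            } else if (current != null && parts.length >= 3 && parts[2].equals("kB")) {
                String name = parts[0].substring(0, parts[0].length() - 1);
                current.put(name, Long.parseLong(parts[1]));
            }
        }
        return mappings;
    }

    /** Existing formula: min(Pss, Shared_Dirty) + Private_Dirty + Private_Clean. */
    static long currentFormulaKb(List<Map<String, Long>> mappings) {
        long total = 0;
        for (Map<String, Long> m : mappings) {
            total += Math.min(m.getOrDefault("Pss", 0L), m.getOrDefault("Shared_Dirty", 0L))
                + m.getOrDefault("Private_Dirty", 0L)
                + m.getOrDefault("Private_Clean", 0L);
        }
        return total;
    }

    /** Proposed variant: sum of Locked; assumed fallback if the field is absent. */
    static long lockedOnlyKb(List<Map<String, Long>> mappings) {
        long total = 0;
        for (Map<String, Long> m : mappings) {
            if (!m.containsKey("Locked")) {
                return currentFormulaKb(mappings); // kernel does not report Locked
            }
            total += m.get("Locked");
        }
        return total;
    }
}
```

For a mapping with a large {{Private_Clean}} value and a small {{Locked}} value, the two methods differ by roughly the reclaimable clean pages, which is the gap the issue describes.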

This message was sent by Atlassian JIRA