hadoop-yarn-issues mailing list archives

From "Craig Welch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.
Date Fri, 08 May 2015 00:44:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533684#comment-14533684 ]

Craig Welch commented on YARN-1680:
-----------------------------------

bq. This requires that when a node heartbeats with a changed available resource, all apps that blacklisted the node need to be notified

Well, that's not quite so.  From what we were discussing, it means that the blacklist deduction
can't be a fixed amount; it needs to be calculated by looking at the unused resources of the
blacklisted nodes during the headroom calculation.  The rest of the above proposal for detecting
changes, etc., still works, but instead of a static deduction value we would keep a reference
to the app's blacklisted nodes and look at their unused resources during the app's headroom
calculation.  So there is that cost, but it's not tied to the heartbeat or to a notification
as such.
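
To make that concrete, something like the following sketch is what I have in mind (illustration
only, not a patch; Resource/Resources/SchedulerNode are the real YARN types, but the wrapper
class and how we obtain the blacklisted-node references are assumptions):

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode;
import org.apache.hadoop.yarn.util.resource.Resources;

class BlacklistHeadroomSketch {
  // Recomputed on every headroom calculation rather than cached as a static
  // value, since the blacklisted nodes' free space changes between heartbeats.
  static Resource blacklistDeduction(Iterable<SchedulerNode> blacklistedNodes) {
    Resource deduction = Resource.newInstance(0, 0);
    for (SchedulerNode node : blacklistedNodes) {
      // a blacklisted node's unallocated space is unusable by this app
      Resources.addTo(deduction, node.getAvailableResource());
    }
    return deduction;
  }

  // The app's headroom, floored at zero so the deduction never goes negative.
  static Resource headroom(Resource clusterAvailable,
      Iterable<SchedulerNode> blacklistedNodes) {
    return Resources.componentwiseMax(Resources.none(),
        Resources.subtract(clusterAvailable,
            blacklistDeduction(blacklistedNodes)));
  }
}
{code}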

bq. headroom for app could be underestimated

I think, generally, we should not take an approach that will underestimate/underutilize if
we have MAPREDUCE-6302 to fall back on.  If we don't, then we might want to underestimate only
when we decide not to do the accurate calculation in some cases based on limits (see immediately
below), but not as a matter of course.

bq. Only do the accurate headroom calculation when there are not too many blacklisted nodes,
as well as not too many apps with blacklisted nodes.

I think if we put a limit on it, it should be a purely local decision: only do the calculation
when an app has fewer than x blacklisted nodes, which we would expect to rarely be an issue.
There is a potential for performance issues here, but we don't really know how great a concern
it is.
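
For illustration, the cutoff could be as simple as this (the constant and its value are
placeholders, not an existing config; it reuses blacklistDeduction from the sketch above,
plus java.util.Set in the imports):

{code:java}
// Assumed threshold; a real implementation would likely read this from
// configuration rather than hard-coding it.
static final int MAX_BLACKLISTED_NODES_FOR_ACCURATE_HEADROOM = 20;

static Resource deductionWithLimit(java.util.Set<SchedulerNode> blacklisted) {
  if (blacklisted.size() >= MAX_BLACKLISTED_NODES_FOR_ACCURATE_HEADROOM) {
    // Too many nodes to walk cheaply: skip the accurate deduction and let a
    // fallback (e.g. MAPREDUCE-6302-style preemption) absorb the overestimate.
    return Resource.newInstance(0, 0);
  }
  return blacklistDeduction(blacklisted); // accurate per-node walk
}
{code}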

bq.  MAPREDUCE-6302 is targeting preempting reducers even if we report inaccurate headroom
for apps. I think the approach looks good to me

I think that may work as a fallback option for MR, assuming it works out without issue, if
we decide not to do the proper headroom calculation in some cases.  But it's MR-specific,
so it won't help non-MR apps, and it has the issues I brought up before with performance
degradation vs. the proper headroom calculation.  For these reasons I don't think it's a
substitute for fixing this issue overall; it may be a fallback option if we limit the cases
where we do the proper adjustment.

bq. Move headroom calculation to the application side; I think we cannot do it, at least for
now... The application will only receive an updated NodeReport when a node changes healthy
status, instead of on the regular heartbeat

Well, in some sense that works OK for this, because we really only need to know about changes
in node status wrt the blacklist to detect when recalculation is needed with the approach
proposed above.  The problem is that we would also need a way to query for current usage per
node while doing the calculation, and I don't know if an efficient call for that exists (it
would ideally be a batch call for N nodes, so we could ask for all the blacklisted nodes at
once).  There is also the broader issue that we don't seem to have a single client-side entry
point for doing this right now, so we would need to touch a few points to add a library or
something of that nature, and AMs we may not be aware of/that are not part of the core would
potentially have to do some integration to get this.
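
For reference, the closest existing call I'm aware of is YarnClient#getNodeReports, which
returns reports for all nodes rather than a chosen batch, so an AM-side version would look
something like this sketch (the wrapper class and method are hypothetical, and I'm not certain
this is the most efficient call available):

{code:java}
import java.io.IOException;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;
import org.apache.hadoop.yarn.util.resource.Resources;

class ClientSideBlacklistUsageSketch {
  // Sum of unused resources on the app's blacklisted hosts, computed via a
  // full-cluster NodeReport scan -- the inefficiency discussed above, since
  // there is no batch "just these N nodes" query that I know of.
  static Resource blacklistedFree(YarnClient client,
      Set<String> blacklistedHosts) throws YarnException, IOException {
    Resource free = Resource.newInstance(0, 0);
    List<NodeReport> reports = client.getNodeReports(NodeState.RUNNING);
    for (NodeReport report : reports) {
      if (blacklistedHosts.contains(report.getNodeId().getHost())) {
        // unused = total capability minus what is currently used on the node
        Resources.addTo(free,
            Resources.subtract(report.getCapability(), report.getUsed()));
      }
    }
    return free;
  }
}
{code}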

> availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1680
>                 URL: https://issues.apache.org/jira/browse/YARN-1680
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>    Affects Versions: 2.2.0, 2.3.0
>         Environment: SuSE 11 SP2 + Hadoop-2.3 
>            Reporter: Rohith
>            Assignee: Craig Welch
>         Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, YARN-1680-v2.patch, YARN-1680.patch
>
>
> There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster slow start
> is set to 1.
> A job is running; its reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4)
> became unstable (3 map tasks got killed), so the MRAppMaster blacklisted it. All reducer
> tasks are now running in the cluster.
> The MRAppMaster does not preempt the reducers, because the headroom used in the reducer
> preemption calculation includes the blacklisted node's memory. This makes jobs hang forever
> (the ResourceManager does not assign any new containers on blacklisted nodes, but the
> availableResources it returns still counts that memory as cluster free memory).
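
To put numbers on the reported scenario (assuming the free space sits on the blacklisted
node): total = 4 x 8GB = 32GB, used = 29GB, so the RM reports 32GB - 29GB = 3GB of headroom.
But that 3GB is on NM-4, which is blacklisted for this app, so the real schedulable headroom
is 0; the reducer-preemption check never fires, the killed maps have nowhere to run, and the
job hangs.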



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
