hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3446) FairScheduler HeadRoom calculation should exclude nodes in the blacklist.
Date Thu, 17 Sep 2015 15:37:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803100#comment-14803100 ]

Karthik Kambatla commented on YARN-3446:
----------------------------------------

Thanks for rebasing the patch, [~zxu]. Comments:

FSAppAttempt:
# How about using a helper method {{subtractResourcesOnBlacklistedNodes}} instead of adding all the logic to {{getHeadroom}} itself?
# Is the optimization of fetching the blacklist only when it has changed necessary? It looks like we optimize the fetch, but not the iteration over it. I think we should either go all the way and also iterate over the blacklisted nodes only when the blacklist has changed, or leave out the optimization until we see a need for it.
# To get the blacklist, can't we just use {{AppSchedulingInfo#getBlacklist}} (needs synchronization) or {{AppSchedulingInfo#getBlacklistCopy}}? Do we need the methods in the scheduler?

If we make these changes, we might not need all the changes in the rest of the files.
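
To illustrate the first suggestion, here is a minimal sketch of what such a helper could look like. It is not the patch's code: resources are reduced to a single memory value in MB, the per-node map and the {{getHeadroom}} signature are hypothetical stand-ins for the real FairScheduler types, and the class name {{HeadroomSketch}} is made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class HeadroomSketch {

    /**
     * Hypothetical helper: subtracts the available memory of every
     * blacklisted node from the given headroom. Unknown node names are
     * ignored and the result never goes below zero.
     */
    static long subtractResourcesOnBlacklistedNodes(
            long headroomMb,
            Map<String, Long> availableMbByNode,
            Set<String> blacklist) {
        long result = headroomMb;
        for (String node : blacklist) {
            Long available = availableMbByNode.get(node);
            if (available != null) {
                result -= available;
            }
        }
        return Math.max(result, 0L);
    }

    /**
     * Simplified stand-in for getHeadroom: the real calculation also
     * takes queue limits into account, which is omitted here.
     */
    static long getHeadroom(
            long clusterAvailableMb,
            Map<String, Long> availableMbByNode,
            Set<String> blacklist) {
        return subtractResourcesOnBlacklistedNodes(
                clusterAvailableMb, availableMbByNode, blacklist);
    }

    public static void main(String[] args) {
        Map<String, Long> availableMbByNode = new LinkedHashMap<>();
        availableMbByNode.put("node1", 4096L);
        availableMbByNode.put("node2", 2048L);
        availableMbByNode.put("node3", 1024L);

        long clusterAvailableMb = 4096L + 2048L + 1024L;
        Set<String> blacklist = new HashSet<>(Arrays.asList("node2"));

        // 7168 MB in total minus 2048 MB on the blacklisted node.
        System.out.println(
                getHeadroom(clusterAvailableMb, availableMbByNode, blacklist)); // prints 5120
    }
}
```

Keeping the subtraction in one helper leaves {{getHeadroom}} short, and the blacklist can be passed in from wherever it is obtained.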


> FairScheduler HeadRoom calculation should exclude nodes in the blacklist.
> -------------------------------------------------------------------------
>
>                 Key: YARN-3446
>                 URL: https://issues.apache.org/jira/browse/YARN-3446
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: YARN-3446.000.patch, YARN-3446.001.patch, YARN-3446.002.patch
>
>
> FairScheduler HeadRoom calculation should exclude nodes in the blacklist.
> MRAppMaster does not preempt the reducers because the headroom used in the reducer preemption calculation includes blacklisted nodes. This makes jobs hang forever (the ResourceManager does not assign any new containers on blacklisted nodes, but the available resource that the AM gets from the RM includes the resources available on blacklisted nodes).
> This issue is similar to YARN-1680, which is for the Capacity Scheduler.
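
The hang described in the issue can be illustrated with a deliberately simplified sketch. This is not the MRAppMaster code: the decision rule, the method name {{shouldPreemptReducers}}, and the numbers are assumptions chosen only to show why an inflated headroom suppresses reducer preemption.

```java
public class ReducerPreemptionSketch {

    /**
     * Hypothetical, simplified rule: reducers are preempted only when
     * maps are pending and the reported headroom cannot fit one map.
     */
    static boolean shouldPreemptReducers(
            long reportedHeadroomMb, long mapRequestMb, int pendingMaps) {
        return pendingMaps > 0 && reportedHeadroomMb < mapRequestMb;
    }

    public static void main(String[] args) {
        long usableFreeMb = 0L;         // nothing free on usable nodes
        long blacklistedFreeMb = 2048L; // free memory only on a blacklisted node
        long mapRequestMb = 1024L;
        int pendingMaps = 1;

        // Headroom that counts the blacklisted node looks sufficient, so
        // no reducer is preempted even though the map can never be placed.
        System.out.println(shouldPreemptReducers(
                usableFreeMb + blacklistedFreeMb, mapRequestMb, pendingMaps)); // prints false

        // Headroom that excludes the blacklisted node triggers preemption.
        System.out.println(shouldPreemptReducers(
                usableFreeMb, mapRequestMb, pendingMaps)); // prints true
    }
}
```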



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
