hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-8232) RMContainer lost queue name when RM HA happens
Date Tue, 22 May 2018 16:08:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-8232:
    Fix Version/s: 2.8.5

Thanks, [~ziqian hu]!  We recently ran into the same issue on 2.8 as well, so I committed
this to branch-3.0, branch-2, branch-2.9, and branch-2.8.

> RMContainer lost queue name when RM HA happens
> ----------------------------------------------
>                 Key: YARN-8232
>                 URL: https://issues.apache.org/jira/browse/YARN-8232
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.8.3
>            Reporter: Hu Ziqian
>            Assignee: Hu Ziqian
>            Priority: Major
>             Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 2.8.5
>         Attachments: YARN-8232-branch-, YARN-8232.001.patch, YARN-8232.002.patch,
> RMContainer has a member variable, queuename, that stores which queue the container belongs
to. When RM HA failover happens and RMContainers are recovered by the scheduler based on NM reports, the
queue name isn't recovered and is always null.
> This causes several problems; preemption is one example. When more than one preemption selector
is in use (for example, when intra-queue preemption is enabled), preemption uses the container's
queue name to deduct preemptable resources. The details are in
> {code:java}
> CapacitySchedulerPreemptionUtils.deductPreemptableResourcesBasedSelectedCandidates(){code}
> If the container's queue name is null, this function throws a YarnRuntimeException
when it tries to get the container's TempQueuePerPartition, and the preemption fails.
> Our patch solves this problem by setting the container's queue name when containers are recovered.
The patch is based on branch-2.8.3.
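
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the YARN code or the actual patch: the class QueueNameRecoverySketch, its Container type, and the recover and deduct methods are hypothetical stand-ins, and IllegalStateException stands in for YarnRuntimeException. It only shows why a per-queue lookup fails when the recovered container's queue name is null, and why restoring the name during recovery avoids that.

```java
import java.util.HashMap;
import java.util.Map;

public class QueueNameRecoverySketch {

    // Hypothetical stand-in for RMContainer; only the queue name matters here.
    static final class Container {
        final String id;
        String queueName; // stays null unless recovery sets it

        Container(String id) {
            this.id = id;
        }
    }

    // Hypothetical recovery step. With restoreQueueName == false it mimics the
    // reported behavior; with true it mimics the idea behind the fix.
    static Container recover(String id, String appQueue, boolean restoreQueueName) {
        Container c = new Container(id);
        if (restoreQueueName) {
            c.queueName = appQueue;
        }
        return c;
    }

    // Hypothetical deduction step: looks up the per-queue preemptable memory (MB)
    // by the container's queue name and fails if no entry is found.
    static void deduct(Map<String, Long> preemptableMb, Container c, long mb) {
        Long current = preemptableMb.get(c.queueName); // a null key finds no entry
        if (current == null) {
            throw new IllegalStateException(
                "No queue entry for container " + c.id + " (queue=" + c.queueName + ")");
        }
        preemptableMb.put(c.queueName, current - mb);
    }

    public static void main(String[] args) {
        Map<String, Long> preemptableMb = new HashMap<>();
        preemptableMb.put("root.a", 4096L);

        // Queue name restored on recovery: the deduction succeeds.
        deduct(preemptableMb, recover("container_1", "root.a", true), 1024L);
        System.out.println("root.a preemptable MB: " + preemptableMb.get("root.a"));

        // Queue name left null on recovery: the lookup fails.
        try {
            deduct(preemptableMb, recover("container_2", "root.a", false), 1024L);
        } catch (IllegalStateException e) {
            System.out.println("Deduction failed: " + e.getMessage());
        }
    }
}
```

In this sketch the fix amounts to a single assignment in the recovery path, which matches the description's summary of the patch only at that level of detail.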

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
