hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5333) Some recovered apps are put into default queue when RM HA
Date Thu, 21 Jul 2016 09:25:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387427#comment-15387427 ]

Sunil G commented on YARN-5333:

Hi [~hex108]
Thanks for working on this patch. I have a few doubts about your test setup when testing
with CS.

bq.Without the patch, apps that submitted to new added queues will be killed, the diagnostics
message is "Application killed on recovery as it was submitted to queue c which no longer
exists after restart.".
When you added the new queue, did you run the "yarn rmadmin -refreshQueues" command? This
ensures that the changed queue topology is refreshed. *Note*: CS doesn't have an auto-refresh
like the Fair scheduler. Also, if you are not using something like {{FileSystemBasedConfigurationProvider}},
I think you have to apply the same queue configuration change on both nodes. Once that is
done, you won't hit this issue on an HA failover.
Could you please confirm? And please correct me if I missed some steps that you
may have already done.
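
For reference, a minimal sketch of a {{yarn-site.xml}} fragment that switches the RM to {{FileSystemBasedConfigurationProvider}}, so that both RMs load the queue configuration from one shared location rather than from their local files. The two property names are the ones listed in {{yarn-default.xml}}; the store path shown is the documented default and is only an assumption here, not something taken from the reporter's cluster. With this provider, the configuration files have to be uploaded to that location first.

```xml
<!-- Sketch only: apply the same fragment on both RM nodes. -->
<property>
  <name>yarn.resourcemanager.configuration.provider-class</name>
  <value>org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider</value>
</property>
<property>
  <!-- Assumed path (the documented default); adjust to the cluster's shared file system. -->
  <name>yarn.resourcemanager.configuration.file-system-based-store</name>
  <value>/yarn/conf</value>
</property>
```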

> Some recovered apps are put into default queue when RM HA
> ---------------------------------------------------------
>                 Key: YARN-5333
>                 URL: https://issues.apache.org/jira/browse/YARN-5333
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>         Attachments: YARN-5333.01.patch, YARN-5333.02.patch, YARN-5333.03.patch
> Enable RM HA and use FairScheduler, with {{yarn.scheduler.fair.allow-undeclared-pools}} and
> {{yarn.scheduler.fair.user-as-default-queue}} both set to false.
> Reproduce steps:
> 1. Start two RMs.
> 2. After both RMs are running, edit {{etc/hadoop/fair-scheduler.xml}} on both RMs and
> add some queues.
> 3. Submit some apps to the newly added queues.
> 4. Stop the active RM; the standby RM will then transition to active and recover the apps.
> However, the new active RM will put the recovered apps into the default queue because it might
> not have loaded the new {{fair-scheduler.xml}}. We need to call {{initScheduler}} before starting
> active services, or move {{refreshAll()}} ahead of {{rm.transitionToActive()}}. *It seems
> this is also important for other schedulers*.
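
The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: {{FailoverOrderSketch}}, {{ToyScheduler}} and {{recoverOnFailover}} are hypothetical names invented for illustration and are not YARN classes or methods. It shows why recovering apps before the queue definitions are reloaded sends them to the default queue, while reloading first keeps them in their original queues.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are real YARN classes.
final class FailoverOrderSketch {

    // Hypothetical stand-in for a scheduler that only knows the queues it last loaded.
    static final class ToyScheduler {
        private final Set<String> knownQueues =
                new HashSet<>(Collections.singleton("default"));

        // Stands in for re-reading the allocation file after it changed on disk.
        void refresh(Set<String> queuesOnDisk) {
            knownQueues.clear();
            knownQueues.add("default");
            knownQueues.addAll(queuesOnDisk);
        }

        // Unknown queues fall back to "default", mirroring the behaviour in the report.
        String place(String requestedQueue) {
            return knownQueues.contains(requestedQueue) ? requestedQueue : "default";
        }
    }

    // Simulates a standby becoming active and recovering apps, with the refresh
    // happening either before or after recovery.
    static Map<String, String> recoverOnFailover(Map<String, String> appToQueue,
                                                 Set<String> queuesOnDisk,
                                                 boolean refreshBeforeRecovery) {
        ToyScheduler scheduler = new ToyScheduler(); // standby starts with stale queues
        if (refreshBeforeRecovery) {
            scheduler.refresh(queuesOnDisk);
        }
        Map<String, String> placement = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : appToQueue.entrySet()) {
            placement.put(e.getKey(), scheduler.place(e.getValue()));
        }
        if (!refreshBeforeRecovery) {
            scheduler.refresh(queuesOnDisk); // too late: apps are already placed
        }
        return placement;
    }

    public static void main(String[] args) {
        Map<String, String> apps = new LinkedHashMap<>();
        apps.put("app_1", "c");        // submitted to a newly added queue
        apps.put("app_2", "default");
        Set<String> onDisk = new HashSet<>(Collections.singleton("c"));

        System.out.println("refresh after recovery:  " + recoverOnFailover(apps, onDisk, false));
        System.out.println("refresh before recovery: " + recoverOnFailover(apps, onDisk, true));
    }
}
```

In this toy model, {{app_1}} ends up in {{default}} when the refresh happens after recovery and stays in {{c}} when the refresh happens first, which is the effect the proposed reordering aims for.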

