hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Hung (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-7252) Removing queue then failing over results in exception
Date Thu, 28 Sep 2017 02:49:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jonathan Hung updated YARN-7252:
--------------------------------
    Fix Version/s: YARN-5734

> Removing queue then failing over results in exception
> -----------------------------------------------------
>
>                 Key: YARN-7252
>                 URL: https://issues.apache.org/jira/browse/YARN-7252
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Critical
>             Fix For: YARN-5734
>
>         Attachments: YARN-7252-YARN-5734.001.patch, YARN-7252-YARN-5734.002.patch
>
>
> Scenario: rm1 and rm2, starting configuration with root.default, root.a. rm1 is active.
First, put root.a into STOPPED state, then remove it. Then put rm1 in standby and rm2 in active.
Here's the exception: {noformat}Operation failed: Error on refreshAll during transition to
Active
> 	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:315)
> 	at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
> 	at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: RefreshAll operation failed
> 	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:747)
> 	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:307)
> 	... 10 more
> Caused by: java.io.IOException: Failed to re-init queues : root.a is deleted from the
new capacity scheduler configuration, but the queue is not yet in stopped state. Current State
: RUNNING
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:436)
> 	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:405)
> 	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:736)
> 	... 11 more
> Caused by: java.io.IOException: root.a is deleted from the new capacity scheduler configuration,
but the queue is not yet in stopped state. Current State : RUNNING
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.validateQueueHierarchy(CapacitySchedulerQueueManager.java:312)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.reinitializeQueues(CapacitySchedulerQueueManager.java:174)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:648)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:432)
> 	... 13 more{noformat}
> Seems rm2 does not think root.a was STOPPED, so when it can't find root.a and sees it
is deleted, it throws exception.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message