hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-6207) Move application can fail when attempt add event is delayed
Date Mon, 20 Feb 2017 20:41:44 GMT

    [ https://issues.apache.org/jira/browse/YARN-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875036#comment-15875036 ]

Varun Saxena edited comment on YARN-6207 at 2/20/17 8:41 PM:
-------------------------------------------------------------

bq. Submit is on dest queue and finish is done on parent.
Right. This was fine before YARN-5756, because LeafQueue#finishApplication only
informed the users manager that the application was being deactivated and then called
ParentQueue#finishApplication.
As we were merely moving the app across queues, informing the users manager about app
deactivation was not required.
But now we also maintain queue state, which will have to be updated for the source queue
(if the queue is DRAINING) on move, and hence this needs to be handled by calling
LeafQueue#appFinished.


was (Author: varun_saxena):
bq. Submit is on dest queue and finish is done on parent.
Right. This was fine before YARN-5756, because LeafQueue#finishApplication only
informed the users manager that the application was being deactivated and then called
ParentQueue#finishApplication.
As we were merely moving the app across queues, informing the users manager about app
deactivation was not required.
But now we also maintain queue state, which may have to be updated for the source queue
(if the queue is DRAINING) on move, and hence this needs to be handled by calling
LeafQueue#appFinished.
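
To illustrate the point about queue state, here is a minimal, self-contained sketch of why the source queue needs to be told that an app has left when it is DRAINING. It is not the CapacityScheduler code: the class names (DrainingQueueSketch, SketchLeafQueue), the three-state enum, and the stop() and moveApplication() helpers are assumptions made up for this example. Only the method name appFinished and the DRAINING state come from the comment above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, not the real YARN classes.
class DrainingQueueSketch {

    // Assumed lifecycle: a stopped queue with running apps drains first.
    enum QueueState { RUNNING, DRAINING, STOPPED }

    static class SketchLeafQueue {
        private final String name;
        private final Set<String> apps = new HashSet<>();
        private QueueState state = QueueState.RUNNING;

        SketchLeafQueue(String name) {
            this.name = name;
        }

        void submitApplication(String appId) {
            if (state != QueueState.RUNNING) {
                throw new IllegalStateException(
                    "Queue " + name + " is " + state + ", cannot accept " + appId);
            }
            apps.add(appId);
        }

        // Stopping a queue that still has apps only drains it.
        void stop() {
            state = apps.isEmpty() ? QueueState.STOPPED : QueueState.DRAINING;
        }

        // Bookkeeping when an app leaves the queue. If the queue was
        // draining and this was the last app, the queue becomes STOPPED.
        void appFinished(String appId) {
            apps.remove(appId);
            if (state == QueueState.DRAINING && apps.isEmpty()) {
                state = QueueState.STOPPED;
            }
        }

        boolean hasApplication(String appId) {
            return apps.contains(appId);
        }

        QueueState getState() {
            return state;
        }
    }

    // Move: submit on the destination, then tell the source the app is gone.
    // Skipping the appFinished call would leave the source stuck in DRAINING.
    static void moveApplication(String appId, SketchLeafQueue source,
                                SketchLeafQueue dest) {
        dest.submitApplication(appId);
        source.appFinished(appId);
    }

    public static void main(String[] args) {
        SketchLeafQueue source = new SketchLeafQueue("source");
        SketchLeafQueue dest = new SketchLeafQueue("dest");
        source.submitApplication("app_1");
        source.stop();
        System.out.println("source before move: " + source.getState());
        moveApplication("app_1", source, dest);
        System.out.println("source after move: " + source.getState());
    }
}
```

In this sketch the state update lives in appFinished rather than in the move path itself, so a normal app completion and a move share the same source-queue bookkeeping.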

> Move application can fail when attempt add event is delayed
> ------------------------------------------------------------
>
>                 Key: YARN-6207
>                 URL: https://issues.apache.org/jira/browse/YARN-6207
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: YARN-6207.001.patch, YARN-6207.002.patch
>
>
> *Steps to reproduce*
> 1. Submit an application and delay the attempt add event to the scheduler
> (simulate this using a debugger at EventDispatcher for SchedulerEventDispatcher).
> 2. Call move application to the destination queue.
> {noformat}
> Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.preValidateMoveApplication(CapacityScheduler.java:2086)
> 	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.moveApplicationAcrossQueue(RMAppManager.java:669)
> 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.moveApplicationAcrossQueues(ClientRMService.java:1231)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.moveApplicationAcrossQueues(ApplicationClientProtocolPBServiceImpl.java:388)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:537)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
> 	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1483)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1429)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1339)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:115)
> 	at com.sun.proxy.$Proxy7.moveApplicationAcrossQueues(Unknown Source)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.moveApplicationAcrossQueues(ApplicationClientProtocolPBClientImpl.java:398)
> 	... 16 more
> {noformat}
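
The race in the steps above can be sketched in isolation. The following is a simplified model under stated assumptions, not the code at CapacityScheduler.java:2086 and not the attached patch: the classes MoveValidationSketch, SketchApp and SketchAttempt and both lookup methods are hypothetical. It only shows that a pre-move check which reaches the queue through the attempt fails while the attempt add event is still pending, whereas a check that reads the queue from the application record does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of scheduler bookkeeping, not the real YARN classes.
class MoveValidationSketch {

    static class SketchAttempt {
        final String queueName;

        SketchAttempt(String queueName) {
            this.queueName = queueName;
        }
    }

    static class SketchApp {
        final String queueName;
        // Stays null until the (possibly delayed) attempt add event is handled.
        SketchAttempt currentAttempt;

        SketchApp(String queueName) {
            this.queueName = queueName;
        }
    }

    private final Map<String, SketchApp> applications = new HashMap<>();

    void addApplication(String appId, String queueName) {
        applications.put(appId, new SketchApp(queueName));
    }

    void addAttempt(String appId) {
        SketchApp app = applications.get(appId);
        app.currentAttempt = new SketchAttempt(app.queueName);
    }

    // Fragile lookup: throws NullPointerException if the attempt is missing.
    String sourceQueueViaAttempt(String appId) {
        return applications.get(appId).currentAttempt.queueName;
    }

    // Lookup that depends only on the application having been added.
    String sourceQueueViaApplication(String appId) {
        SketchApp app = applications.get(appId);
        if (app == null) {
            throw new IllegalArgumentException("Unknown application " + appId);
        }
        return app.queueName;
    }

    public static void main(String[] args) {
        MoveValidationSketch scheduler = new MoveValidationSketch();
        scheduler.addApplication("app_1", "a");
        // The attempt add event has not been processed yet.
        try {
            scheduler.sourceQueueViaAttempt("app_1");
        } catch (NullPointerException e) {
            System.out.println("lookup via attempt failed: attempt not added yet");
        }
        System.out.println("lookup via application: "
            + scheduler.sourceQueueViaApplication("app_1"));
    }
}
```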


