hadoop-yarn-issues mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (YARN-8290) Yarn application failed to recover with "Error Launching job : User is not set in the application report" error after RM restart
Date Thu, 17 May 2018 00:20:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang reassigned YARN-8290:
-------------------------------

             Assignee: Eric Yang
    Affects Version/s: 3.1.1

[~leftnoteasy] Following your suggestion: the ACL information is set too late, so killing
the AM before the ACL information is propagated can cause RM recovery to load a partial
application record. The suggested change is to move the ACL setup into
ApplicationToSchedulerTransition. The patch moves the block of code accordingly. Let me know
if this is the correct fix.
Thanks
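
The ordering problem described in the comment above can be illustrated with a small, self-contained sketch. It is not taken from YARN-8290.001.patch and uses no Hadoop classes: the step names, the StoredApp record, and both orderings are hypothetical stand-ins, and placing SET_ACLS before LAUNCH_AM is only a simplified model of running the ACL setup inside the scheduler transition. The sketch checks whether any kill point can leave a launched AM whose record has no ACLs.

```java
import java.util.Arrays;
import java.util.List;

public class AclOrderingSketch {

    /** Hypothetical lifecycle steps; these are not YARN class names. */
    public enum Step { ADD_TO_SCHEDULER, SET_ACLS, LAUNCH_AM }

    /** Hypothetical stand-in for the application record reloaded on recovery. */
    public static final class StoredApp {
        public boolean inScheduler;
        public boolean aclsSet;
        public boolean amLaunched;
    }

    /** Assumed problematic ordering: ACLs are set only after the AM exists. */
    public static final List<Step> LATE_ACLS =
        Arrays.asList(Step.ADD_TO_SCHEDULER, Step.LAUNCH_AM, Step.SET_ACLS);

    /** Assumed ordering after the proposed move: ACLs are set before the AM exists. */
    public static final List<Step> EARLY_ACLS =
        Arrays.asList(Step.ADD_TO_SCHEDULER, Step.SET_ACLS, Step.LAUNCH_AM);

    /** Applies the first completedSteps steps, then stops as if the AM were killed. */
    public static StoredApp runThenKill(List<Step> order, int completedSteps) {
        StoredApp app = new StoredApp();
        for (int i = 0; i < completedSteps && i < order.size(); i++) {
            switch (order.get(i)) {
                case ADD_TO_SCHEDULER: app.inScheduler = true; break;
                case SET_ACLS:         app.aclsSet = true;     break;
                case LAUNCH_AM:        app.amLaunched = true;  break;
            }
        }
        return app;
    }

    /** Returns true if some kill point leaves a launched AM whose record lacks ACLs. */
    public static boolean hasPartialRecordWindow(List<Step> order) {
        for (int k = 0; k <= order.size(); k++) {
            StoredApp app = runThenKill(order, k);
            if (app.amLaunched && !app.aclsSet) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("late ordering has window:  " + hasPartialRecordWindow(LATE_ACLS));
        System.out.println("early ordering has window: " + hasPartialRecordWindow(EARLY_ACLS));
    }
}
```

Under these assumptions the late ordering reports a window and the early ordering does not, because in the early ordering the AM cannot exist (and so cannot be killed) before the ACL step has run.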

> Yarn application failed to recover with "Error Launching job : User is not set in the application report" error after RM restart
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8290
>                 URL: https://issues.apache.org/jira/browse/YARN-8290
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Yesha Vora
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: YARN-8290.001.patch
>
>
> Scenario:
> 1) Start 5 streaming applications in the background
> 2) Kill the active RM to cause an RM failover
> After the RM failover, the application failed with the error below.
> {code}18/02/01 21:24:29 WARN client.RequestHedgingRMFailoverProxyProvider: Invocation returned exception on [rm2] : org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1517520038847_0003' doesn't exist in RM. Please check that the job submission was successful.
> 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:338)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> , so propagating back to caller.
> 18/02/01 21:24:29 INFO impl.YarnClientImpl: Submitted application application_1517520038847_0003
> 18/02/01 21:24:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hrt_qa/.staging/job_1517520038847_0003
> 18/02/01 21:24:30 ERROR streaming.StreamJob: Error Launching job : User is not set in the application report
> Streaming Command Failed!{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


