hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1321) NMTokenCache is a singleton, prevents multiple AMs running in a single JVM to work correctly
Date Sat, 19 Oct 2013 18:08:47 GMT

    [ https://issues.apache.org/jira/browse/YARN-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799963#comment-13799963 ]

Vinod Kumar Vavilapalli commented on YARN-1321:

bq. Llama is a single JVM hosting multiple unmanaged ApplicationMasters that run at the same
time (in parallel). Because NMTokenCache is a singleton, NMTokens for the same node from the
different AMs step on each other.
Okay, that explains the context.
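
The collision described above can be sketched in Java as follows. This is only an illustration: `SharedTokenCache`, `setToken`, and `getToken` are hypothetical stand-ins, not the actual NMTokenCache API, and the sketch assumes nothing more than tokens being stored per node address in JVM-wide static state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedTokenCacheDemo {
    // Hypothetical stand-in for a JVM-wide singleton cache:
    // exactly one token is remembered per node address.
    static final class SharedTokenCache {
        private static final Map<String, String> TOKENS = new ConcurrentHashMap<>();

        static void setToken(String nodeAddr, String token) {
            TOKENS.put(nodeAddr, token);
        }

        static String getToken(String nodeAddr) {
            return TOKENS.get(nodeAddr);
        }
    }

    public static void main(String[] args) {
        String node = "node1:8041"; // illustrative node address

        // Two AMs in the same JVM each store their token for the same node.
        SharedTokenCache.setToken(node, "token-for-attempt-0001");
        SharedTokenCache.setToken(node, "token-for-attempt-0002");

        // The first AM now reads back the second AM's token.
        System.out.println(SharedTokenCache.getToken(node));
    }
}
```

Because the second write replaces the first, an AM for the first attempt would present a token issued for the second attempt, which matches the shape of the "Unauthorized request to start container" error in the issue description.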

bq. So far this is the only issue we've run into while using multiple AMs in a single JVM.
That is good to know. You should add some kind of simple test so that this assumption
isn't broken in the future.

bq. It seems that after this patch goes in, all applications will need to change to
work correctly with the client libraries?
Sigh, that is true. Changing from static to non-static breaks apps. We can do one of two things:
 - Keep the statics around for the single-AM-per-JVM case, which I believe will cover 99% of cases,
and add new non-static APIs, or
 - Do something like what Omkar is suggesting: add optional APIs to track NMTokens per appattempt.
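
The two options above might look roughly like the following Java sketch. All names here (`InstanceTokenCache`, `getDefault`, `PerAttemptTokenCache`) are hypothetical and chosen for illustration; they are not taken from the patch or from the YARN client library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TokenCacheOptions {
    // Option 1 (hypothetical): an instantiable cache, plus a static default
    // instance so existing single-AM-per-JVM code keeps working unchanged.
    static final class InstanceTokenCache {
        private static final InstanceTokenCache DEFAULT = new InstanceTokenCache();
        private final Map<String, String> tokens = new ConcurrentHashMap<>();

        static InstanceTokenCache getDefault() {
            return DEFAULT;
        }

        void setToken(String nodeAddr, String token) {
            tokens.put(nodeAddr, token);
        }

        String getToken(String nodeAddr) {
            return tokens.get(nodeAddr);
        }
    }

    // Option 2 (hypothetical): keep one shared cache, but key each token
    // by application attempt as well as by node address.
    static final class PerAttemptTokenCache {
        private static final Map<String, String> TOKENS = new ConcurrentHashMap<>();

        private static String key(String attemptId, String nodeAddr) {
            return attemptId + "@" + nodeAddr;
        }

        static void setToken(String attemptId, String nodeAddr, String token) {
            TOKENS.put(key(attemptId, nodeAddr), token);
        }

        static String getToken(String attemptId, String nodeAddr) {
            return TOKENS.get(key(attemptId, nodeAddr));
        }
    }

    public static void main(String[] args) {
        String node = "node1:8041"; // illustrative node address

        // Option 1: each AM owns its cache, so tokens no longer collide.
        InstanceTokenCache am1 = new InstanceTokenCache();
        InstanceTokenCache am2 = new InstanceTokenCache();
        am1.setToken(node, "token-1");
        am2.setToken(node, "token-2");
        System.out.println(am1.getToken(node) + " " + am2.getToken(node));

        // Option 2: one cache, but lookups are scoped to the app attempt.
        PerAttemptTokenCache.setToken("attempt-1", node, "token-1");
        PerAttemptTokenCache.setToken("attempt-2", node, "token-2");
        System.out.println(PerAttemptTokenCache.getToken("attempt-1", node));
    }
}
```

In this sketch, option 1 leaves existing callers on the default instance and asks only multi-AM hosts to create their own caches, while option 2 leaves the cache shared and asks callers to pass the attempt id on every lookup.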

Irrespective of the solution, I think we should skip the MR and dist-shell changes altogether,
at least to prove that the changes are compatible. We can maybe fix them in a follow-up.

> NMTokenCache is a singleton, prevents multiple AMs running in a single JVM to work
> ----------------------------------------------------------------------------------------------
>                 Key: YARN-1321
>                 URL: https://issues.apache.org/jira/browse/YARN-1321
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>            Priority: Blocker
>             Fix For: 2.2.1
>         Attachments: YARN-1321.patch
> NMTokenCache is a singleton. Because of this, if running multiple AMs in a single JVM,
> NMTokens for the same node from different AMs step on each other and starting containers fails
> due to mismatched tokens.
> The error observed on the client side is something like:
> {code}
> ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:llama (auth:PROXY) via llama (auth:SIMPLE) cause:org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
> NMToken for application attempt : appattempt_1382038445650_0002_000001 was used for starting container with container token issued for application attempt : appattempt_1382038445650_0001_000001
> {code}
