hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-15059) 3.0 deployment cannot work with old version MR tar ball which break rolling upgrade
Date Thu, 30 Nov 2017 15:58:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jason Lowe updated HADOOP-15059:
--------------------------------
    Attachment: HADOOP-15059.005.patch

Thanks for the reviews, Daryn and Vinod!  I'm attaching a patch that moves the byte value
to enum mapping into the enum itself.  I'm not convinced the refactor was worth it, but here
it is for you to decide.

bq. If only YARN had a system whereby one could dynamically label a node based upon the current
software stack. Then one could schedule around that and/or set up distributed cache content
to match.

The RM tracks which version each NM registered with, so it knows the software stack being
provided by that node.  Unfortunately YARN doesn't know what software stack the user's application
expects, so it cannot fixup the distributed cache to match the expectation.  In addition the
framework could be embedded in the user's app which cannot be fixed in the general case by
tweaking the distributed cache.

> 3.0 deployment cannot work with old version MR tar ball which break rolling upgrade
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-15059
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15059
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>            Reporter: Junping Du
>            Assignee: Jason Lowe
>            Priority: Blocker
>         Attachments: HADOOP-15059.001.patch, HADOOP-15059.002.patch, HADOOP-15059.003.patch,
HADOOP-15059.004.patch, HADOOP-15059.005.patch
>
>
> I tried to deploy 3.0 cluster with 2.9 MR tar ball. The MR job is failed because following
error:
> {noformat}
> 2017-11-21 12:42:50,911 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created
MRAppMaster for application appattempt_1511295641738_0003_000001
> 2017-11-21 12:42:51,070 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to
load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-11-21 12:42:51,118 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster:
Error starting MRAppMaster
> java.lang.RuntimeException: Unable to determine current user
> 	at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:254)
> 	at org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:220)
> 	at org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:212)
> 	at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:888)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1638)
> Caused by: java.io.IOException: Exception reading /tmp/nm-local-dir/usercache/jdu/appcache/application_1511295641738_0003/container_e03_1511295641738_0003_01_000001/container_tokens
> 	at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:208)
> 	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:907)
> 	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:820)
> 	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:689)
> 	at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:252)
> 	... 4 more
> Caused by: java.io.IOException: Unknown version 1 in token storage.
> 	at org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:226)
> 	at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:205)
> 	... 8 more
> 2017-11-21 12:42:51,122 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status
1: java.lang.RuntimeException: Unable to determine current user
> {noformat}
> I think it is due to token incompatiblity change between 2.9 and 3.0. As we claim "rolling
upgrade" is supported in Hadoop 3, we should fix this before we ship 3.0 otherwise all MR
running applications will get stuck during/after upgrade.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message