hadoop-mapreduce-issues mailing list archives

From "Alejandro Abdelnur (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3727) jobtoken location property in jobconf refers to wrong jobtoken file
Date Wed, 25 Jan 2012 23:06:42 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193428#comment-13193428 ]

Alejandro Abdelnur commented on MAPREDUCE-3727:

Vinod, thanks for confirming this is an issue in trunk as well.

It is not possible to do this at the Oozie level: the property is used within the scope of the Hadoop
submission code (by TokenCache, i.e., when computing splits). The Hadoop submission code should remove
this property from the jobconf before writing the jobconf to HDFS.
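
A minimal sketch of that idea, not the actual Hadoop patch: a plain Map stands in for the
jobconf, and the class name, method name, and example token path are hypothetical. Only the
property name comes from the issue description.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a Map stands in for the jobconf; all names here are hypothetical. */
public class SubmissionSketch {
    // Property name as given in the issue description.
    static final String CREDENTIALS_BINARY = "mapreduce.job.credentials.binary";

    /**
     * Returns a copy of the conf with the submission-only property removed.
     * The copy is what would be written out for the tasks; the caller's conf
     * is left untouched so the submitting process can still use the property.
     */
    static Map<String, String> stripSubmissionOnlyProperties(Map<String, String> conf) {
        Map<String, String> persisted = new HashMap<>(conf);
        persisted.remove(CREDENTIALS_BINARY);
        return persisted;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical path to the launcher job's token file.
        conf.put(CREDENTIALS_BINARY, "/tmp/launcher/jobtoken");
        conf.put("mapreduce.job.name", "real-job");

        Map<String, String> persisted = stripSubmissionOnlyProperties(conf);
        System.out.println("persisted has property: " + persisted.containsKey(CREDENTIALS_BINARY));
        System.out.println("submitter has property: " + conf.containsKey(CREDENTIALS_BINARY));
    }
}
```

Working on a copy keeps the value available to the submitting process (which still needs it
to load credentials) while keeping it out of what the tasks see.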

I guess a warn-and-continue would work too, but it would be quite confusing to get a warning
about a jobtoken file that belongs to some other job.

I think we should remove it on submission (as I stated before), as that property's current value
is meaningful only in the context of the current submission. If somebody is doing a submission
from the submitted job (like Oozie does), then they should start the whole cycle again (getting
the ENV var and setting the property again in the context of the submitted job).
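
As a rough illustration of "starting the cycle again" from inside a submitted job: the sketch
below re-reads the ENV var and re-seeds the property. Maps stand in for the environment and
the jobconf, and the class and method names are hypothetical. Dropping a stale inherited value
when the ENV var is absent is my assumption, not something stated in the issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: Maps stand in for the process environment and the jobconf. */
public class LauncherSeedSketch {
    // Both names are taken from the issue description.
    static final String TOKEN_FILE_ENV = "HADOOP_TOKEN_FILE_LOCATION";
    static final String CREDENTIALS_BINARY = "mapreduce.job.credentials.binary";

    /**
     * Seeds the credentials property from the current process's environment.
     * Any inherited value is overwritten, or dropped if the ENV var is not set
     * (assumption), so it never points at another job's token file.
     */
    static void seedCredentialsLocation(Map<String, String> env, Map<String, String> conf) {
        String tokenFile = env.get(TOKEN_FILE_ENV);
        if (tokenFile != null && !tokenFile.isEmpty()) {
            conf.put(CREDENTIALS_BINARY, tokenFile);
        } else {
            conf.remove(CREDENTIALS_BINARY);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical stale value inherited from an earlier submission.
        conf.put(CREDENTIALS_BINARY, "/tmp/previous-job/jobtoken");

        seedCredentialsLocation(System.getenv(), conf);
        System.out.println(CREDENTIALS_BINARY + " = " + conf.get(CREDENTIALS_BINARY));
    }
}
```
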
> jobtoken location property in jobconf refers to wrong jobtoken file
> -------------------------------------------------------------------
>                 Key: MAPREDUCE-3727
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3727
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 1.0.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>            Priority: Critical
>             Fix For: 1.1.0
> The Oozie launcher job (for an MR/Pig/Hive/Sqoop action) reads the location of the jobtoken
> file from the *HADOOP_TOKEN_FILE_LOCATION* ENV var and seeds it as the *mapreduce.job.credentials.binary*
> property in the jobconf that will be used to launch the real (MR/Pig/Hive/Sqoop) job.
> The MR/Pig/Hive/Sqoop submission code (via Hadoop job submission) correctly uses the
> injected *mapreduce.job.credentials.binary* property to load the credentials and submit its
> MR jobs.
> The problem is that the *mapreduce.job.credentials.binary* property also makes it to
> the tasks of the MR/Pig/Hive/Sqoop MR jobs.
> If for some reason the MR/Pig/Hive/Sqoop MR code runs some logic that triggers credential
> loading, then, because the property is set, the credential loading fails trying to load a jobtoken
> file of the launcher job, which does not exist in the context of the MR/Pig/Hive/Sqoop jobs.
> More specifically, we are seeing this happen with certain Hive queries that trigger
> conditional code within their RowContainer, which then uses FileInputFormat.getSplits(),
> and then the TokenCache tries to load credentials from a file that is for the wrong job.


