hadoop-common-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8225) DistCp fails when invoked by Oozie
Date Wed, 22 Aug 2012 19:43:42 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439792#comment-13439792 ]

Daryn Sharp commented on HADOOP-8225:
-------------------------------------

This is actually caused by multiple bugs:
# MR job submission requests tokens it already has
# MR job submission doesn't always pass all tokens (JHS, MR, HIVE, etc)
# Oozie is using a devious way to detect the exit code of an action

Details:
# The reported exception occurs when a task tries to get tokens it ALREADY has.  Job submission
gets missing tokens for input/output paths and adds them to the UGI for RPC connections.
Job submission doesn't check the UGI, so it concludes that it lacks the token and requests
another.  The NN connection, however, authenticates with the very token the job thinks it
lacks, and the NN rejects the request because you can't use a token to get a token.
# Similarly, distcp also does some prep work to acquire tokens prior to job submission.  So
again, a task tries and fails to get the tokens it already has.  Invoking a command like
distcp directly will "work" (masking the bug) because it uses the TGT to get another token even
if it already has one in the UGI.
# Job submission doesn't appear to propagate non-FS/MR tokens in the task's UGI into the new
job submission.
# Oozie uses a security manager to intercept an action's System.exit, throws a SecurityException
containing the exit code, and later catches that exception to determine success/failure.
Devious!  DistCp calls System.exit(0) inside a try block whose catch clause catches Oozie's
SecurityException, logs it, and then calls System.exit(-999), again generating an Oozie
SecurityException.  Due to the try/catch, DistCp will ALWAYS appear to fail.
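
The control flow in item 4 can be reproduced without Oozie or a real security manager. The sketch below is a simplified model, not the actual Oozie or DistCp code: `InterceptedExit`, `exit`, `launch`, `buggyMain`, and `fixedMain` are hypothetical names, and `exit` simply throws, as an intercepting security manager would.

```java
public class ExitInterceptDemo {
    // Hypothetical exception carrying the requested exit code, standing in
    // for the SecurityException thrown when System.exit is intercepted.
    static final class InterceptedExit extends SecurityException {
        final int code;
        InterceptedExit(int code) {
            super("Intercepted System.exit(" + code + ")");
            this.code = code;
        }
    }

    // Hypothetical stand-in for System.exit under an intercepting security manager.
    static void exit(int code) {
        throw new InterceptedExit(code);
    }

    // Hypothetical launcher: runs the action and reads the exit code from the exception.
    static int launch(Runnable action) {
        try {
            action.run();
            return 0; // the action returned without calling exit
        } catch (InterceptedExit e) {
            return e.code;
        }
    }

    // Buggy pattern: exit(0) sits inside the try, so the broad catch swallows
    // the interception and the action then exits with -999.
    static void buggyMain() {
        try {
            // ... the copy work would happen here ...
            exit(0);
        } catch (Exception e) {
            exit(-999);
        }
    }

    // Fixed pattern: compute the code inside the try, call exit outside it.
    static void fixedMain() {
        int code;
        try {
            // ... the copy work would happen here ...
            code = 0;
        } catch (Exception e) {
            code = -999;
        }
        exit(code);
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + launch(ExitInterceptDemo::buggyMain));
        System.out.println("fixed: " + launch(ExitInterceptDemo::fixedMain));
    }
}
```

Running `main` prints `buggy: -999` and `fixed: 0`: the successful run is reported as a failure only when the exit call sits inside the try block.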

Solutions:
* #1, #2, #3 (numbered as in Details): seed the Job with the existing UGI tokens
* #4: DistCp calls System.exit OUTSIDE of the try block

I seeded the Job's credentials with the existing UGI tokens because it seems unreasonable
to require all apps that launch jobs to be aware of running as a task.
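
As a rough illustration of why seeding helps, here is a toy model of the token check. It does not use the Hadoop API: `requestToken`, `collectJobTokens`, and the service names are hypothetical, and plain maps stand in for the UGI and the Job's credentials. The only behavior taken from the description above is that a token is requested whenever the Job's credentials lack one, and that the request fails if the connection itself authenticates with a token.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TokenSeedingDemo {
    // Hypothetical stand-in for the NameNode: it refuses to issue a token
    // over a connection that was itself authenticated with a token.
    static String requestToken(String service, boolean connectionUsesToken) {
        if (connectionUsesToken) {
            throw new SecurityException("cannot use a token to get a token: " + service);
        }
        return "new-token-for-" + service;
    }

    // Hypothetical job submission step: gather a token for every service the
    // job needs, optionally seeding the job's credentials from the UGI first.
    static Map<String, String> collectJobTokens(Map<String, String> ugiTokens,
                                                List<String> services,
                                                boolean seedFromUgi) {
        Map<String, String> jobCredentials = new HashMap<>();
        if (seedFromUgi) {
            // Seeding also carries along tokens the job never asks for by
            // itself (the non-FS/MR case in item 3).
            jobCredentials.putAll(ugiTokens);
        }
        for (String service : services) {
            if (!jobCredentials.containsKey(service)) {
                // The RPC connection uses whatever the UGI holds for this service.
                boolean connectionUsesToken = ugiTokens.containsKey(service);
                jobCredentials.put(service, requestToken(service, connectionUsesToken));
            }
        }
        return jobCredentials;
    }

    public static void main(String[] args) {
        Map<String, String> ugiTokens = new HashMap<>();
        ugiTokens.put("nn1", "fs-token");
        ugiTokens.put("hive", "hive-token");
        List<String> services = Arrays.asList("nn1");

        try {
            collectJobTokens(ugiTokens, services, false);
        } catch (SecurityException e) {
            System.out.println("unseeded: " + e.getMessage());
        }
        System.out.println("seeded: " + collectJobTokens(ugiTokens, services, true).keySet());
    }
}
```

In this model the unseeded call fails for `nn1` because the UGI already holds a token for it, while the seeded call succeeds without contacting the stand-in NameNode and keeps the `hive` token as well.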

                
> DistCp fails when invoked by Oozie
> ----------------------------------
>
>                 Key: HADOOP-8225
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8225
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.23.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-8225.patch, HADOOP-8225.patch, HADOOP-8225.patch
>
>
> When DistCp is invoked through a proxy-user (e.g. through Oozie), the delegation-token-store isn't picked up by DistCp correctly. One sees failures such as:
> ERROR [main] org.apache.hadoop.tools.DistCp: Couldn't complete DistCp operation:
> java.lang.SecurityException: Intercepted System.exit(-999)
>     at org.apache.oozie.action.hadoop.LauncherSecurityManager.checkExit(LauncherMapper.java:651)
>     at java.lang.Runtime.exit(Runtime.java:88)
>     at java.lang.System.exit(System.java:904)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:357)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:394)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142)
> Looking over the DistCp code, one sees that HADOOP_TOKEN_FILE_LOCATION isn't being copied to mapreduce.job.credentials.binary, in the job-conf. I'll post a patch for this shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
