Date: Thu, 23 Aug 2012 06:43:42 +1100 (NCT)
From: "Daryn Sharp (JIRA)"
To: common-issues@hadoop.apache.org
Message-ID: <1799589211.2061.1345664622984.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HADOOP-8225) DistCp fails when invoked by Oozie

    [ https://issues.apache.org/jira/browse/HADOOP-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439792#comment-13439792 ]

Daryn Sharp commented on HADOOP-8225:
-------------------------------------

This is actually caused by multiple bugs:
# MR job submission requests tokens it already has
# MR job submission doesn't always pass all tokens (JHS, MR, HIVE, etc)
# Oozie is using a devious way to detect the exit code of an action

Details:
# The reported exception occurs when a task tries to get tokens it ALREADY has.  Job submission gets missing tokens for input/output paths and adds them to the UGI for RPC connections.  Job submission doesn't check the UGI, so it doesn't think it has the token, and thus requests another.  The NN connection uses the token that the job doesn't think it has!  The NN squawks that you can't use a token to get a token.
# Similarly, distcp also does some prep work to acquire tokens prior to job submission.  So again, a task tries and fails to get the tokens it already has.  Invoking a command like distcp directly will "work" (which masks the bug) because it uses the TGT to get another token even if it already has one in the UGI.
# Job submission doesn't appear to propagate non-FS/MR tokens in the task's UGI into the new job submission.
# Oozie uses a security manager to intercept an action's System.exit, throws a SecurityException containing the exit code, and later catches that exception to determine success/failure.  Devious!  Distcp calls System.exit(0) inside a try block which catches oozie's SecurityException, logs it, and then calls System.exit(-999), again generating an oozie SecurityException.  Due to the try/catch, distcp will ALWAYS appear to fail.

Solutions:
* #1,2,3: Seeding Job with existing UGI tokens
* #4: Distcp calls System.exit OUTSIDE of the try block

I seeded the Job's credentials with the existing UGI tokens because it seems unreasonable to require all apps that launch jobs to be aware of running as a task.

> DistCp fails when invoked by Oozie
> ----------------------------------
>
>                 Key: HADOOP-8225
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8225
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.23.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-8225.patch, HADOOP-8225.patch, HADOOP-8225.patch
>
>
> When DistCp is invoked through a proxy-user (e.g. through Oozie), the delegation-token-store isn't picked up by DistCp correctly. One sees failures such as:
> ERROR [main] org.apache.hadoop.tools.DistCp: Couldn't complete DistCp operation:
> java.lang.SecurityException: Intercepted System.exit(-999)
>       at org.apache.oozie.action.hadoop.LauncherSecurityManager.checkExit(LauncherMapper.java:651)
>       at java.lang.Runtime.exit(Runtime.java:88)
>       at java.lang.System.exit(System.java:904)
>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:357)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:394)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142)
> Looking over the DistCp code, one sees that HADOOP_TOKEN_FILE_LOCATION isn't being copied to mapreduce.job.credentials.binary, in the job-conf. I'll post a patch for this shortly.
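
For readers following the System.exit discussion in the comment (the Oozie item under Details and the matching solution), here is a rough, self-contained sketch of why an exit call inside the try block always looks like a failure, and why moving it outside fixes that. Everything in it is a hypothetical stand-in: ExitInterceptionSketch, InterceptedExit, exit() and launch() are illustration names, not Oozie or DistCp code, and exit() only simulates an intercepting security manager by throwing instead of installing a real one.

```java
// Hypothetical stand-ins only: none of these names come from Oozie or DistCp.
class ExitInterceptionSketch {

    /** Models the SecurityException that carries the intercepted exit code. */
    static class InterceptedExit extends SecurityException {
        final int code;

        InterceptedExit(int code) {
            super("Intercepted System.exit(" + code + ")");
            this.code = code;
        }
    }

    /** Stands in for System.exit under an intercepting security manager: it throws instead of exiting. */
    static void exit(int code) {
        throw new InterceptedExit(code);
    }

    /** Buggy shape: the success exit sits inside the try, so the catch swallows the interception. */
    static void buggyMain() {
        try {
            // ... the copy itself succeeds ...
            exit(0);
        } catch (Exception e) {
            // The interception of exit(0) lands here and is treated as a failure.
            exit(-999);
        }
    }

    /** Fixed shape: decide the exit code inside the try, call exit outside it. */
    static void fixedMain() {
        int code = 0;
        try {
            // ... the copy itself succeeds ...
        } catch (Exception e) {
            code = -999;
        }
        exit(code);
    }

    /** Models the launcher: run the action and read the exit code off the exception. */
    static int launch(Runnable action) {
        try {
            action.run();
        } catch (InterceptedExit e) {
            return e.code;
        }
        throw new IllegalStateException("action returned without calling exit");
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + launch(ExitInterceptionSketch::buggyMain));
        System.out.println("fixed: " + launch(ExitInterceptionSketch::fixedMain));
    }
}
```

Running main prints "buggy: -999" and then "fixed: 0". The simulation was chosen over a real SecurityManager because recent JDKs disallow installing one by default; the control flow being illustrated is the same.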