Date: Thu, 13 Oct 2016 12:08:20 +0000 (UTC)
From: "Junping Du (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()


     [ https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6792:
----------------------------------
    Status: Patch Available  (was: Open)

Submitting the patch to kick off the Jenkins tests.

The patch looks good overall. Several comments:
1. {{fileOwner.equalsIgnoreCase(currentUser.getUserName())}} - I think the current assumption in Hadoop is that user names are case sensitive, so user and USER are treated as different users. For AzureFS or other similar cloud-based file systems, do we change that assumption here, especially for the domain name? If not, we should keep a case-sensitive check here.
2. The exception message includes all possible user names, so it could contain duplicates when login user = real user (i.e., when no proxy user is used). We should do a quick check and only log both when login user != real user, shouldn't we?
3. It would be great if we could figure out some way to add a unit test for the use case we are adding here.

> Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6792
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>            Reporter: Santhosh G Nayak
>            Assignee: Santhosh G Nayak
>         Attachments: MAPREDUCE-6792.1.patch
>
>
> Background -
> Currently, {{JobSubmissionFiles#JobStagingDir()}} assumes that the file owner returned by {{FileSystem#getFileStatus()}} is always the user's short principal name, which is true for HDFS. However, some HDFS-compatible file systems, such as [Azure Data Lake Store (ADLS)|https://azure.microsoft.com/en-in/services/data-lake-store/], work in multi-tenant environments and can have users with the same name belonging to different domains, for example {{user1@company1.com}} and {{user1@company2.com}}. It would be ambiguous if {{FileSystem#getFileStatus()}} returned only the user's short principal name (without the domain name) as the owner of the file/directory.
> The following code block allows only the user's short principal name as owner.
> It simply fails, saying that the ownership of the staging directory is not as expected, if the owner returned by {{FileStatus#getOwner()}} is not equal to the short principal name of the current user.
> {code}
> String realUser;
> String currentUser;
> UserGroupInformation ugi = UserGroupInformation.getLoginUser();
> realUser = ugi.getShortUserName();
> currentUser = UserGroupInformation.getCurrentUser().getShortUserName();
> if (fs.exists(stagingArea)) {
>   FileStatus fsStatus = fs.getFileStatus(stagingArea);
>   String owner = fsStatus.getOwner();
>   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
>     throw new IOException("The ownership on the staging directory " +
>         stagingArea + " is not as expected. " +
>         "It is owned by " + owner + ". The directory must " +
>         "be owned by the submitter " + currentUser + " or " +
>         "by " + realUser);
>   }
> {code}
> The proposal is to remove the strict restriction to the short principal name by also allowing the user's full principal name as the owner of the staging area directory in {{JobSubmissionFiles#JobStagingDir()}}.
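
To make review comments 1 and 2 and the proposal concrete, here is a minimal sketch in plain Java of what the relaxed ownership check might look like. It is not taken from MAPREDUCE-6792.1.patch: the class {{StagingOwnerCheck}} and both of its methods are hypothetical names, and plain strings stand in for the values that Hadoop would presumably supply through {{UserGroupInformation#getShortUserName()}} and {{UserGroupInformation#getUserName()}}, so the example runs without Hadoop on the classpath.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Hadoop or of the
// attached patch. Plain strings replace UserGroupInformation lookups.
final class StagingOwnerCheck {

  private StagingOwnerCheck() {}

  /**
   * Returns true when the owner reported by the file system equals either the
   * short name or the full principal name. String#equals is case sensitive,
   * so "user1" and "USER1" are treated as different users (comment 1).
   */
  static boolean isExpectedOwner(String owner, String shortName, String fullName) {
    return owner.equals(shortName) || owner.equals(fullName);
  }

  /**
   * Joins the acceptable owner names for an error message, listing each name
   * only once even if it is passed several times (comment 2).
   */
  static String describeExpectedOwners(String... names) {
    Set<String> unique = new LinkedHashSet<>();
    for (String name : names) {
      unique.add(name);
    }
    return String.join(" or ", unique);
  }

  public static void main(String[] args) {
    // Assumed sample values, following the example in the issue description.
    String shortName = "user1";
    String fullName = "user1@company1.com";

    // A full principal name reported by the file system is accepted.
    System.out.println(isExpectedOwner("user1@company1.com", shortName, fullName));
    // The same short name from another domain is rejected.
    System.out.println(isExpectedOwner("user1@company2.com", shortName, fullName));
    // Current user and real user are identical here, so each name appears once.
    System.out.println(
        describeExpectedOwners(shortName, fullName, shortName, fullName));
  }
}
```

In the real method the message would be passed to the {{IOException}} that is thrown when {{isExpectedOwner}} returns false. Keeping the comparison and the message building in small static methods like these would also make comment 3 easier to address, since they can be unit tested without a file system.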