Date: Fri, 1 Sep 2017 21:37:00 +0000 (UTC)
From: "Yongjun Zhang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

    [ https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151047#comment-16151047 ]

Yongjun Zhang edited comment on HDFS-12357 at 9/1/17 9:36 PM:
--------------------------------------------------------------

Thanks [~chris.douglas] and [~manojg]. Sorry for the lengthy reply:

{quote}
Would a filter implementation wrapping the configured, external attribute provider suffice?
{quote}

The current patch implements this logic inline (like an inlined version of a wrapper class, in C++ terms). If we move this logic into a wrapper class, I see some issues:

1. The wrapper needs to create two provider objects, one for the default (HDFS) attributes and one for the external provider, and switch between the two. However, in the existing code, I don't see that a default provider object is always created; see 2.a below. The default value of the following config is empty, which means no default provider will be created:

{code}
<property>
  <name>dfs.namenode.inode.attributes.provider.class</name>
  <description>
    Name of class to use for delegating HDFS authorization.
  </description>
</property>
{code}

I am not sure whether we should require a default provider to be configured here.
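For concreteness, here is a minimal, hypothetical sketch of what such a wrapper might look like (this is not the patch; the class name, the isSpecialUser() helper, and the fallback behavior are all assumptions):

{code}
import org.apache.hadoop.hdfs.server.namenode.INodeAttributeProvider;
import org.apache.hadoop.hdfs.server.namenode.INodeAttributes;

// Hypothetical wrapper around the configured external provider.
public class BypassWrappingAttributeProvider extends INodeAttributeProvider {
  private final INodeAttributeProvider external; // the configured provider

  public BypassWrappingAttributeProvider(INodeAttributeProvider external) {
    this.external = external;
  }

  @Override
  public void start() { external.start(); }

  @Override
  public void stop() { external.stop(); }

  @Override
  public INodeAttributes getAttributes(String[] pathElements,
      INodeAttributes inode) {
    // For a special user, return the HDFS-native attributes that were passed
    // in rather than consulting the external provider. This only works
    // because the caller hands us the default attributes; there is no
    // standalone "default provider" object to delegate to (issue 1 above).
    if (isSpecialUser()) {
      return inode;
    }
    return external.getAttributes(pathElements, inode);
  }

  // Hypothetical: would check the current caller against a configured list.
  private boolean isSpecialUser() {
    return false;
  }
}
{code}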
2. Currently there are two places that decide whether to consult the external attribute provider.

2.a.

{code}
INodeAttributes getAttributes(INodesInPath iip)
    throws FileNotFoundException {
  INode node = FSDirectory.resolveLastINode(iip);
  int snapshot = iip.getPathSnapshotId();
  INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
  if (attributeProvider != null) {
    // permission checking sends the full components array including the
    // first empty component for the root. however file status
    // related calls are expected to strip out the root component according
    // to TestINodeAttributeProvider.
    byte[][] components = iip.getPathComponents();
    components = Arrays.copyOfRange(components, 1, components.length);
    nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
  }
  return nodeAttrs;
}
{code}

Here we have already read the attributes from HDFS, and then decide whether to overwrite them with the provider's data. The easiest approach is to check whether the user is a special user and, if so, not ask the provider for data at all. If we do this in a wrapper class, we always have to return some attributes, which may or may not come from HDFS. That is not a clean implementation and may incur runtime cost.
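A hedged sketch of that inline check (the actual patch may differ; isSpecialUser() is a hypothetical helper):

{code}
INodeAttributes getAttributes(INodesInPath iip)
    throws FileNotFoundException {
  INode node = FSDirectory.resolveLastINode(iip);
  int snapshot = iip.getPathSnapshotId();
  INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
  // Skip the external provider entirely when the caller is a special user,
  // so the returned attributes are always the HDFS-native ones.
  if (attributeProvider != null && !isSpecialUser()) {
    byte[][] components = iip.getPathComponents();
    components = Arrays.copyOfRange(components, 1, components.length);
    nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
  }
  return nodeAttrs;
}
{code}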
2.b.

{code}
@VisibleForTesting
FSPermissionChecker getPermissionChecker(String fsOwner, String superGroup,
    UserGroupInformation ugi) throws AccessControlException {
  return new FSPermissionChecker(
      fsOwner, superGroup, ugi, attributeProvider);
}
{code}

Here we need to pass either null or the configured external attributeProvider to the permission checker. If we move this logic into the wrapper class, the wrapper needs an API that returns either the external provider or null, which would then be passed as the "attributeProvider" parameter above, like:

{code}
return new FSPermissionChecker(
    fsOwner, superGroup, ugi, attributeProvider.getRealAttributeProvider());
{code}

We would need to add this getRealAttributeProvider() API to the base provider class, which is a bit weird because the API is only meaningful in the wrapper layer. And changing the provider API is what we are trying to avoid here.
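For illustration, one way to express this choice inline, without adding an API to the base provider class (a sketch only; isSpecialUser(ugi) is a hypothetical helper):

{code}
@VisibleForTesting
FSPermissionChecker getPermissionChecker(String fsOwner, String superGroup,
    UserGroupInformation ugi) throws AccessControlException {
  // Resolve the provider up front: special users get null, and therefore
  // pure-HDFS permission checking; everyone else gets the external provider.
  INodeAttributeProvider provider =
      isSpecialUser(ugi) ? null : attributeProvider;
  return new FSPermissionChecker(fsOwner, superGroup, ugi, provider);
}
{code}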
Thoughts? Thanks.


> Let NameNode to bypass external attribute provider for special user
> -------------------------------------------------------------------
>
>                 Key: HDFS-12357
>                 URL: https://issues.apache.org/jira/browse/HDFS-12357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12357.001.patch, HDFS-12357.002.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is this: when we run distcp from one cluster to another (or within the same cluster), in addition to copying file data we copy the metadata from source to target. If an external attribute provider is enabled, the metadata may be read from the provider, so provider data read at the source may be saved to the target HDFS.
> We want to avoid saving metadata from the external provider to HDFS, so we want to bypass the external provider when doing the distcp (or hadoop fs -cp) operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202 and the other in HDFS-12294. The proposal here is the third one.
> The idea is to introduce a new config that specifies a special user (or a list of users) and to let the NN bypass the external provider when the current user is one of the special users.
> Applications that need data from the external attribute provider will not work if they run as a special user, so the constraint of this approach is that the special users should not run applications that need data from the external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn], [~manojg] for the discussions in the other jiras.
> I'm creating this one to discuss further.
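As a sketch of the new config proposed in the description above (the property name and values below are hypothetical and not from any committed patch):

{code}
<property>
  <name>dfs.namenode.inode.attributes.provider.bypass.users</name>
  <value>hdfs,distcpuser</value>
  <description>
    Comma-separated list of users for which the NameNode bypasses the
    external INode attribute provider and serves HDFS-native attributes.
  </description>
</property>
{code}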