Date: Wed, 26 Jul 2017 19:45:00 +0000 (UTC)
From: "Yongjun Zhang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Created] (HDFS-12202) Provide new set of FileSystem API to bypass external attribute provider

Yongjun Zhang created HDFS-12202:
------------------------------------

             Summary: Provide new set of FileSystem API to bypass external attribute provider
                 Key: HDFS-12202
                 URL: https://issues.apache.org/jira/browse/HDFS-12202
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: hdfs, hdfs-client
            Reporter: Yongjun Zhang


HDFS client uses
{code}
/**
 * Return a file status object that represents the path.
 * @param f The path we want information from
 * @return a FileStatus object
 * @throws FileNotFoundException when the path does not exist
 * @throws IOException see specific implementation
 */
public abstract FileStatus getFileStatus(Path f) throws IOException;

/**
 * List the statuses of the files/directories in the given path if the path is
 * a directory.
 *
 * Does not guarantee to return the List of files/directories status in a
 * sorted order.
 *
 * Will not return null. Expect IOException upon access error.
 * @param f given path
 * @return the statuses of the files/directories in the given path
 * @throws FileNotFoundException when the path does not exist
 * @throws IOException see specific implementation
 */
public abstract FileStatus[] listStatus(Path f)
    throws FileNotFoundException, IOException;
{code}
to get the FileStatus of files. When an external attribute provider (INodeAttributeProvider) is enabled for a cluster, the provider is consulted for some of the relevant info (including ACL, group, etc.), and what it supplies is returned in the FileStatus.

This causes a problem when we use distcp to copy files from srcCluster to tgtCluster: if srcCluster has an external attribute provider enabled, the data we copy will contain data from the attribute provider, which we may not want.

This jira is created to add a new set of interfaces for distcp to use, so that distcp can copy HDFS data only and bypass the external attribute provider's data.

The new set of APIs would look like
{code}
/**
 * Return a file status object that represents the path.
 * @param f The path we want information from
 * @param bypassExtAttrProvider if true, bypass external attr provider
 *          when it's in use.
 * @return a FileStatus object
 * @throws FileNotFoundException when the path does not exist
 * @throws IOException see specific implementation
 */
public FileStatus getFileStatus(Path f, final boolean bypassExtAttrProvider)
    throws IOException;

/**
 * List the statuses of the files/directories in the given path if the path is
 * a directory.
 *
 * Does not guarantee to return the List of files/directories status in a
 * sorted order.
 *
 * Will not return null. Expect IOException upon access error.
 * @param f given path
 * @param bypassExtAttrProvider if true, bypass external attr provider
 *          when it's in use.
 * @return the statuses of the files/directories in the given path
 * @throws FileNotFoundException when the path does not exist
 * @throws IOException see specific implementation
 */
public FileStatus[] listStatus(Path f, final boolean bypassExtAttrProvider)
    throws FileNotFoundException, IOException;
{code}
So when bypassExtAttrProvider is true, the external attribute provider will be bypassed.

Thanks.
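
To illustrate the intended behaviour of the flag, here is a minimal, self-contained sketch. It is not the Hadoop implementation: Attrs, AttrProvider and Namespace are hypothetical stand-ins (for the attributes carried in a FileStatus, for an INodeAttributeProvider, and for the NameNode-side lookup), and it assumes for simplicity that the provider only overrides the group.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model. None of these types are real Hadoop classes.
final class BypassProviderSketch {

  /** Stand-in for the attributes returned in a FileStatus. */
  static final class Attrs {
    final String owner;
    final String group;

    Attrs(String owner, String group) {
      this.owner = owner;
      this.group = group;
    }
  }

  /** Stand-in for an external attribute provider. */
  interface AttrProvider {
    Attrs override(String path, Attrs stored);
  }

  /** Stand-in for the namespace lookup that may consult the provider. */
  static final class Namespace {
    private final Map<String, Attrs> stored = new HashMap<>();
    private final AttrProvider provider; // null when no provider is configured

    Namespace(AttrProvider provider) {
      this.provider = provider;
    }

    void put(String path, Attrs attrs) {
      stored.put(path, attrs);
    }

    Attrs getFileStatus(String path, boolean bypassExtAttrProvider)
        throws FileNotFoundException {
      Attrs attrs = stored.get(path);
      if (attrs == null) {
        throw new FileNotFoundException(path);
      }
      // Return what HDFS itself stores when bypassing or when no provider is set.
      if (provider == null || bypassExtAttrProvider) {
        return attrs;
      }
      return provider.override(path, attrs);
    }
  }

  public static void main(String[] args) throws Exception {
    // Assumed provider behaviour: replace the group, keep the owner.
    Namespace ns = new Namespace(
        (path, stored) -> new Attrs(stored.owner, "provider-group"));
    ns.put("/data/f1", new Attrs("alice", "hdfs-group"));

    System.out.println(ns.getFileStatus("/data/f1", false).group); // provider-group
    System.out.println(ns.getFileStatus("/data/f1", true).group);  // hdfs-group
  }
}
```

In this sketch a distcp-like caller would pass true to see only what HDFS itself stores, while callers passing false keep today's behaviour of receiving the provider's values.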