hadoop-hdfs-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12294) Let distcp to bypass external attribute provider when calling getFileStatus etc at source cluster
Date Mon, 14 Aug 2017 22:14:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126494#comment-16126494 ]

Chris Douglas commented on HDFS-12294:

[~yzhangal], can you be more specific about the requirements? Perhaps provide an example?

(Repeating some comments from HDFS-12202.) If the external provider only adds attributes,
this could perhaps be implemented as a filter in distcp. If the external provider also removes
attributes, then the provider itself could be configured to return the original FileStatus
attributes under some rule (e.g., for a backup user).

> Let distcp to bypass external attribute provider when calling getFileStatus etc at source
> -------------------------------------------------------------------------------------------------
>                 Key: HDFS-12294
>                 URL: https://issues.apache.org/jira/browse/HDFS-12294
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
> This is an alternative solution to HDFS-12202, which proposed introducing a new set of
> APIs with an additional boolean parameter, bypassExtAttrProvider, to let the NN bypass the
> external attribute provider when serving getFileStatus. The goal is to prevent distcp from
> copying attributes from one cluster's external attribute provider and saving them to
> another cluster's fsimage.
> The solution here is, instead of adding a parameter, to encode the flag in the path
> itself. When getFileStatus (and some other calls) is invoked, the NN parses the path and
> determines whether the external attribute provider needs to be bypassed. The suggested
> encoding is to add a prefix to the path before calling getFileStatus, e.g. /a/b/c becomes
> /.reserved/bypassExtAttr/a/b/c. The NN will parse the path at the very beginning.
> Thanks much to [~andrew.wang] for this suggestion. The scope of the change is smaller, and
> we don't have to change the FileSystem APIs.
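
For illustration only, the prefix encoding described in the issue could be sketched roughly
as below. This is not the HDFS-12294 patch: the class and method names (BypassPathSketch,
encode, parse, Parsed) are hypothetical, and only the /.reserved/bypassExtAttr prefix comes
from the description. It assumes absolute paths and uses no Hadoop classes.

```java
public final class BypassPathSketch {
    // Prefix taken from the issue description; everything else is an assumption.
    public static final String BYPASS_PREFIX = "/.reserved/bypassExtAttr";

    /** Hypothetical parse result: the real path and whether to bypass the provider. */
    public static final class Parsed {
        public final String path;
        public final boolean bypass;

        Parsed(String path, boolean bypass) {
            this.path = path;
            this.bypass = bypass;
        }
    }

    /** Client side (e.g. distcp): prepend the prefix before calling getFileStatus. */
    public static String encode(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("absolute path required: " + path);
        }
        return path.equals("/") ? BYPASS_PREFIX : BYPASS_PREFIX + path;
    }

    /** Server side (NN): strip the prefix, if present, at the very beginning. */
    public static Parsed parse(String path) {
        if (path.equals(BYPASS_PREFIX)) {
            return new Parsed("/", true);
        }
        // Require a '/' after the prefix so that a path such as
        // /.reserved/bypassExtAttrFoo is not mistaken for a bypass request.
        if (path.startsWith(BYPASS_PREFIX + "/")) {
            return new Parsed(path.substring(BYPASS_PREFIX.length()), true);
        }
        return new Parsed(path, false);
    }

    public static void main(String[] args) {
        String encoded = encode("/a/b/c");
        Parsed parsed = parse(encoded);
        System.out.println(encoded + " -> " + parsed.path + " bypass=" + parsed.bypass);
    }
}
```

In this sketch a path without the prefix is returned unchanged with bypass set to false, so
existing callers would see no difference in behavior.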
