hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Charles Lamb (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-6007) Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved
Date Tue, 05 Aug 2014 16:52:12 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Charles Lamb updated MAPREDUCE-6007:

    Attachment: MAPREDUCE-6007.002.patch


-px : preserve raw and non-raw xattrs
-pr : no xattrs are preserved
-p  : preserve raw xattrs
-pxr: preserve non-raw xattrs
    : no xattrs are preserved


Yes, this is almost all correct. Assuming that -pr (above) is the same as -pd in the patch,
then the only typo above is the last line which should be "no raw xattrs are preserved", but
I think you had the "raw" part implied so I throw that in just to be crystal clear.

    raw src, raw dst: the options apply as specified above
    raw src, not-raw dst, dst supports xattrs but no /reserved/.raw: we will fail to set raw
xattrs at runtime.
    raw src, dst doesn't support xattrs: if -pX is specified, throws an exception. Else, silently
discards raw xattrs.

bq. If the src is /reserved/.raw, the user is expecting preservation of raw xattrs when -p
or -pX is specified. In this scenario, we should test that the dest is /.reserved/raw and
that it's present on the dstFS. There might be other weird cases, haven't thought through
all of them

Good idea. I've cleaned this up so that anytime there's a /.reserved/raw src path present,
and either pd or a non- /.reserved/raw target path is specified, an exception is thrown (actually
a DistCpConstants.INVALID_ARGUMENT exit code is returned and a message logged). This will
presumably help admins from shooting themselves in the foot.

bq. We have both noPreserveRaw and preserveRaw booleans, can we standardize on one everywhere?
I'd like a negative one, call it disableRaw or excludeRaw since it better captures the meaning
of the flag. exclude feels a bit better IMO, but it looks like -pe is taken.
I changed all of the booleans to be "disableRawXattrs" (vs disableRaw). This way it matches
the "preserveXattrs" siblings in the code. However, in SimpleCopyList#doBuildListing, the
boolean is still named copyRawXAttrs to mirror its copyAcls and copyXAttrs siblings. It felt
wrong to change that to a disableRawXattrs. Ditto #traverseNonEmptyDirectory. Not to mention
that if we negated the sense of the boolean from copyRawXattrs to disableRawXattrs, it would
hair up the calls to toCopyListFileStatus when they are &&'d with child.isDirectory().
It feels like it's cleaner to leave all three booleans in the positive in these two cases.

I left the name of the enum the same (DISABLERAWXATTRS) since (1) it is an available letter
("d"), and (2) I think you were only suggesting changing the name of the booleans in the code.
I can change it to DISABLERAW if you prefer.

bq.  What's the expected behavior when the dest doesn't support xattrs or reserved raw, or
supports xattrs but not reserved raw?

If the dest doesn't support xattrs or raw, then (assuming -px was specified) an exception
is thrown (per current behavior). If the dest supports xattrs but not reserved raw, then it
depends on whether reserved raw was specified in the source or not. If it was, then an INVALID_ARGUMENT
(-1) is returned as the exit code. If reserved/raw was not specified on any source path, then

bq.    CopyListing, this is where we'd also test to see if the destFS has a /.reserved/raw

Yes, I've added a test for src/dst reserved/raw mismatch here.

I've updated the documentation per your suggestions.


    if (!Path.getPathWithoutSchemeAndAuthority(target).toString().

What if the target is a relative path here?

Check me on this. I convinced myself that a relative path could never be relative to /.reserved/raw
since you can't set your working directory to that.

bq. Any reason this isn't part of the existing XAttr test? They seem pretty similar, and you
also added a PXD test to the existing test.

I started out with a combined test, but during development, they diverged enough in terms
of structure, @BeforeClass, files that they operated on, initializing files, directories,
xattrs, etc. that it felt cleaner to leave them split into two tests.

> Create a new option for distcp -p which causes raw.* namespace extended attributes to
not be preserved
> ------------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-6007
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: distcp
>    Affects Versions: fs-encryption
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: MAPREDUCE-6007.001.patch, MAPREDUCE-6007.002.patch
> As part of the Data at Rest Encryption work (HDFS-6134), we need to create a new option
for distcp which causes raw.* namespace extended attributes to not be preserved. See the doc
in HDFS-6509 for details. The default for this option will be to preserve raw.* xattrs.

This message was sent by Atlassian JIRA

View raw message