hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
Date Tue, 09 May 2017 06:21:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002129#comment-16002129
] 

Andrew Wang commented on HDFS-6984:
-----------------------------------

Thanks again for the continued work here Chris, I want to get this in as badly as you. I spent
a little more time looking at the patch, and I'm comfortable with {{HdfsFileStatus extends
FileStatus}} once we make sure symlinks are treated opaquely.

Symlinks:

* I saw three calls to HdfsFileStatus#getSymlink: FSNamesystem#logAuditEvent, NNRpcServer#getLinkTarget,
JsonUtil#toJsonMap. I think the latter two should be calling {{getSymlinkBytes}} for opacity.
* We should add a warning to HdfsFileStatus#getSymlink about handling symlink targets opaquely,
for future coders.

The rest:

{quote}
Here is the relevant line. It's consistent with the documentation- that is the FsPermission[Extension]
for that path- but including the flag bits in the AclStatus permission field looks more like
an accident of the implementation, not part of its intended API.
{quote}

Good catch. I see you removed this already in the 009 patch, agree that it looks like an accident.

I'm convinced by the rest. Documenting the other pending issues here for completeness:

* Any additional checking for symlink target opacity and conversion to a Path. I only checked
usages when it's a {{HdfsFileStatus}}, not usages of the base {{FileStatus}} method.
* Snapshots potentially not working with EC or encryption? If so, this might be more broadly
an xattrs+snapshots bug.
* Backwards compatibility by still setting the permission bits for old clients.

> In Hadoop 3, make FileStatus serialize itself via protobuf
> ----------------------------------------------------------
>
>                 Key: HDFS-6984
>                 URL: https://issues.apache.org/jira/browse/HDFS-6984
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Colin P. McCabe
>            Assignee: Colin P. McCabe
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, HDFS-6984.003.patch, HDFS-6984.004.patch,
HDFS-6984.005.patch, HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, HDFS-6984.009.patch,
HDFS-6984.nowritable.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this to serialize
it and send it over the wire.  But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}}
which serves to serialize this information.  The protobuf form is preferable, since it allows
us to add new fields in a backwards-compatible way.  Another issue is that already a lot of
subclasses of FileStatus don't override the Writable methods of the superclass, breaking the
interface contract that read(status.write) should be equal to the original status.
> In Hadoop 3, we should just make FileStatus serialize itself via protobuf so that we
don't have to deal with these issues.  It's probably too late to do this in Hadoop 2, since
user code may be relying on the existing FileStatus serialization there.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message