hadoop-hdfs-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
Date Tue, 02 May 2017 16:55:04 GMT

     [ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HDFS-6984:
    Attachment: HDFS-6984.009.patch

Sorry for the noise caused by using Jenkins to run tests. I was trying to avoid an explicit
{{Flags}} type for HdfsFileStatus, but if {{feInfo}} is not present for directories in encryption
zones, it's probably cleaner this way.

We could add a different API for resolving whether some set of features/capabilities is available
or applied for a given {{FileSystem}}/{{FileStatus}}, and avoid the {{EnumSet}} in the constructor,
but this issue has sprawled far enough. At least it's not in the user-facing API.

This restores the {{FsPermissionExtension}} on the client side (for applications calling the
deprecated methods). If the flags are not set (on older servers), the client sets the flags from
the old permission bits. For backwards compatibility with 2.x clients, we can either keep setting
the permission bits server-side or backport the flags to branch-2 for the "2.x version compatible
with 3.x clusters" release. Any opinions on this? It's simplest to keep setting the bits server-side
and delete that code when we remove {{FsPermissionExtension}}.
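
As a rough illustration of the client-side fallback described above (not the code in the patch), the sketch below derives a flag set from legacy permission bits when the server sent no flags. The class name, the {{Flag}} constants, and the bit positions are all assumptions made for this example; absent flags are modeled as {{null}}.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for illustration; not the HdfsFileStatus API from the patch.
final class LegacyFlagFallback {

  /** Illustrative feature flags; the names are assumptions. */
  enum Flag { HAS_ACL, HAS_CRYPT, HAS_EC }

  // Assumed positions of the extension bits, above the 12 low permission bits.
  static final int ACL_BIT = 1 << 12;
  static final int ENCRYPTED_BIT = 1 << 13;
  static final int ERASURE_CODED_BIT = 1 << 14;

  /**
   * Returns the server-supplied flags when present. A null argument models an
   * older server that sent no flags; in that case the flags are rebuilt from
   * the extension bits packed into the permission value.
   */
  static Set<Flag> resolveFlags(Set<Flag> serverFlags, int permBits) {
    EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
    if (serverFlags != null) {
      flags.addAll(serverFlags);
      return flags;
    }
    if ((permBits & ACL_BIT) != 0) {
      flags.add(Flag.HAS_ACL);
    }
    if ((permBits & ENCRYPTED_BIT) != 0) {
      flags.add(Flag.HAS_CRYPT);
    }
    if ((permBits & ERASURE_CODED_BIT) != 0) {
      flags.add(Flag.HAS_EC);
    }
    return flags;
  }

  /** Strips the extension bits, leaving only the 12 low permission bits. */
  static int permissionOnly(int permBits) {
    return permBits & 07777;
  }

  public static void main(String[] args) {
    int legacy = 0755 | ACL_BIT | ENCRYPTED_BIT;
    System.out.println(resolveFlags(null, legacy));
    System.out.println(Integer.toOctalString(permissionOnly(legacy)));
  }
}
```

Keeping the fallback in one place like this would make it easy to delete once the permission bits are no longer set server-side.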

I also had a question about snapshots. I'm not certain whether EC and encryption zones work
with them. The current patch doesn't handle ACLs, but I wanted to get a quick check on that
before digging through the implementation.

[~andrew.wang], are you OK with the general direction of v009?

> In Hadoop 3, make FileStatus serialize itself via protobuf
> ----------------------------------------------------------
>                 Key: HDFS-6984
>                 URL: https://issues.apache.org/jira/browse/HDFS-6984
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Colin P. McCabe
>            Assignee: Colin P. McCabe
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, HDFS-6984.003.patch, HDFS-6984.004.patch,
HDFS-6984.005.patch, HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, HDFS-6984.009.patch,
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this to serialize
it and send it over the wire.  But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}}
which serves to serialize this information.  The protobuf form is preferable, since it allows
us to add new fields in a backwards-compatible way.  Another issue is that already a lot of
subclasses of FileStatus don't override the Writable methods of the superclass, breaking the
interface contract that read(status.write) should be equal to the original status.
> In Hadoop 3, we should just make FileStatus serialize itself via protobuf so that we
don't have to deal with these issues.  It's probably too late to do this in Hadoop 2, since
user code may be relying on the existing FileStatus serialization there.
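
The broken round-trip contract mentioned in the description can be shown with a minimal sketch. It uses simplified stand-in classes ({{Status}} and {{LocatedStatus}} are hypothetical names, not Hadoop's {{FileStatus}} hierarchy) and assumes a subclass that adds a field without overriding the inherited serialization methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WritableRoundTrip {

  /** Simplified stand-in for a Writable-style base class. */
  static class Status {
    String path = "";
    long length;

    void write(DataOutput out) throws IOException {
      out.writeUTF(path);
      out.writeLong(length);
    }

    void readFields(DataInput in) throws IOException {
      path = in.readUTF();
      length = in.readLong();
    }
  }

  /** Adds a field but inherits write/readFields unchanged. */
  static class LocatedStatus extends Status {
    String blockHost = "";
  }

  /** Serializes the status to bytes and reads it back into a fresh instance. */
  static LocatedStatus roundTrip(LocatedStatus original) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      original.write(new DataOutputStream(buf));
      LocatedStatus copy = new LocatedStatus();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    LocatedStatus original = new LocatedStatus();
    original.path = "/data/part-0";
    original.length = 42L;
    original.blockHost = "dn1.example.com";

    LocatedStatus copy = roundTrip(original);
    // The base-class fields survive; the subclass field is silently dropped.
    System.out.println(copy.path + " " + copy.length + " [" + copy.blockHost + "]");
  }
}
```

A schema-based encoding avoids this class of bug only if every field is declared in the message definition, which is the motivation the description gives for moving to protobuf.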
