hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
Date Fri, 17 Mar 2017 00:01:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929203#comment-15929203 ]

Andrew Wang commented on HDFS-6984:
-----------------------------------

Hi Chris, replies inline, lots of agreement with your direction:

bq. Should we also try to remove Writable from FsPermission? We could deprecate the Writable
API instead of removing it from these classes, in case projects/users depend on it downstream...
the serialization/conversion can still live in a library, but be called from the deprecated
methods.

SGTM

bq. Since both FsPermission#getAclBit and FsPermission#getEncryptedBit/FileStatus#isEncrypted
are user-facing, should these also be part of FSProtos? ...While we're at it, should we also
change HdfsFileStatusProto to stop packing the acl/encryption bits among the permission bits?

Also sounds great, I'd love a bitfield rather than stuffing these in FsPermission.
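
A minimal sketch of what a separate bitfield could look like, as opposed to stuffing extra bits on top of the permission short. The class name, flag names, and bit positions here are hypothetical and are not the layout used by FsPermission or HdfsFileStatusProto; the only point is that the rwx bits stay untouched and the attributes travel in their own numeric field.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration only; names and bit positions are assumptions,
// not the actual FsPermission or HdfsFileStatusProto layout.
final class StatusFlagsSketch {
    /** Attributes that are conceptually separate from the rwx permission bits. */
    enum Flag {
        HAS_ACL(1),
        HAS_CRYPT(1 << 1);

        final int mask;

        Flag(int mask) {
            this.mask = mask;
        }
    }

    /** Encode a set of flags as an int suitable for a dedicated numeric field. */
    static int encode(Set<Flag> flags) {
        int bits = 0;
        for (Flag f : flags) {
            bits |= f.mask;
        }
        return bits;
    }

    /** Decode the int back into a set; bits with no matching flag are ignored. */
    static Set<Flag> decode(int bits) {
        EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
        for (Flag f : Flag.values()) {
            if ((bits & f.mask) != 0) {
                flags.add(f);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        short permission = 0644; // plain rwx bits, nothing packed on top
        int flags = encode(EnumSet.of(Flag.HAS_ACL, Flag.HAS_CRYPT));
        System.out.println(Integer.toOctalString(permission) + " flags=" + decode(flags));
    }
}
```

Ignoring unknown bits on decode is one way to let an older reader tolerate flags added later, which is the same compatibility property the protobuf form is meant to provide.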

bq. is there a reason encryption info is included in HdfsFileStatus, but ACLs are not? Would
it be inappropriate to add a FileSystem#getAclStatus(FileStatus), in case an implementation
returns this information in its response (potentially avoiding the 2-RPC overhead)?

We need the FEInfo to read or write a file, and the ACLs are rarely needed by the client since
they're enforced server-side.

I don't think there are any plans to include ACLs in HdfsFileStatus because of the bloat,
but your suggestion makes sense if there is such an FS implementation. Good follow-on.

> In Hadoop 3, make FileStatus serialize itself via protobuf
> ----------------------------------------------------------
>
>                 Key: HDFS-6984
>                 URL: https://issues.apache.org/jira/browse/HDFS-6984
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Colin P. McCabe
>            Assignee: Colin P. McCabe
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, HDFS-6984.003.patch, HDFS-6984.004.patch,
>                      HDFS-6984.005.patch, HDFS-6984.nowritable.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this to serialize
> it and send it over the wire.  But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}}
> which serves to serialize this information.  The protobuf form is preferable, since it allows
> us to add new fields in a backwards-compatible way.  Another issue is that already a lot of
> subclasses of FileStatus don't override the Writable methods of the superclass, breaking the
> interface contract that read(status.write) should be equal to the original status.
> In Hadoop 3, we should just make FileStatus serialize itself via protobuf so that we
> don't have to deal with these issues.  It's probably too late to do this in Hadoop 2, since
> user code may be relying on the existing FileStatus serialization there.
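
The round-trip problem described in the issue can be reproduced in miniature without Hadoop on the classpath. The sketch below uses plain java.io streams and hypothetical classes (BaseStatus, ExtendedStatus) standing in for FileStatus and a subclass; it is not the real Writable interface, only the same write/readFields shape. The subclass adds a field but inherits the serialization methods unchanged, so the field does not survive a write followed by a read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-ins for FileStatus and a subclass; not Hadoop classes.
final class RoundTripSketch {
    static class BaseStatus {
        String path = "";
        long length;

        void write(DataOutput out) throws IOException {
            out.writeUTF(path);
            out.writeLong(length);
        }

        void readFields(DataInput in) throws IOException {
            path = in.readUTF();
            length = in.readLong();
        }
    }

    /** Adds a field but does not override write/readFields. */
    static class ExtendedStatus extends BaseStatus {
        String owner = "";
    }

    /** Serialize with the inherited write, then deserialize into a fresh object. */
    static ExtendedStatus roundTrip(ExtendedStatus original) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buf));
            ExtendedStatus copy = new ExtendedStatus();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ExtendedStatus s = new ExtendedStatus();
        s.path = "/data/file";
        s.length = 42L;
        s.owner = "alice";
        ExtendedStatus copy = roundTrip(s);
        // The base fields survive; the subclass field is silently reset.
        System.out.println(copy.path + " " + copy.length + " owner='" + copy.owner + "'");
    }
}
```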



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

