Date: Wed, 15 Mar 2017 21:32:41 +0000 (UTC)
From: "Andrew Wang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf

    [ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927021#comment-15927021 ]

Andrew Wang commented on HDFS-6984:
-----------------------------------

Hi Chris, thanks for revving, a few review comments:

I was wondering if you saw my comment way back about the ACL bit / encrypted bit / etc:

bq. The takeaways for me are that we should make a separate bitfield for these flags. If we want to preserve cross-serialization, we'd need to also add this field to HdfsFileStatus, and we'd always have to be careful with field numbers.

Right now, it looks like if you pass in an HdfsFileStatus with these bits set, they're dropped. It'd be good to unit test these getters. If you can think up a unit test to detect the addition of new bits (e.g. isErasureCoded), that'd also be great.

Since a lot of fields are optional in the PB, should we also test with these optional fields unset? I'm wondering if the resulting FileStatus is filled in with reasonable defaults.

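A minimal sketch of the separate-bitfield idea raised above, assuming hypothetical flag names and bit values (StatusFlags, HAS_ACL, HAS_CRYPT and HAS_EC are illustrative only and are not taken from any HDFS-6984 patch). It packs the flags into one int, treats an unset field as "no flags", and includes a check that could back a unit test for newly added bits:

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names throughout; this is not the HDFS-6984 patch.
final class StatusFlags {
    /** Each flag owns exactly one bit; the bit value must stay stable on the wire. */
    enum Flag {
        HAS_ACL(0x01),
        HAS_CRYPT(0x02),
        HAS_EC(0x04);

        final int mask;

        Flag(int mask) {
            this.mask = mask;
        }
    }

    private StatusFlags() {}

    /** Pack a set of flags into a single int, suitable for one optional integer field. */
    static int encode(Set<Flag> flags) {
        int bits = 0;
        for (Flag f : flags) {
            bits |= f.mask;
        }
        return bits;
    }

    /** Unpack an int; 0 (the field left unset) yields the empty set, unknown bits are ignored. */
    static EnumSet<Flag> decode(int bits) {
        EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
        for (Flag f : Flag.values()) {
            if ((bits & f.mask) != 0) {
                flags.add(f);
            }
        }
        return flags;
    }

    /** True if every flag has a distinct single-bit mask; fails if a new flag reuses a bit. */
    static boolean masksAreDistinctSingleBits() {
        int seen = 0;
        for (Flag f : Flag.values()) {
            if (Integer.bitCount(f.mask) != 1 || (seen & f.mask) != 0) {
                return false;
            }
            seen |= f.mask;
        }
        return true;
    }
}
```

A test in this style would round-trip every combination of flags through encode and decode, so a flag that is added to the enum but dropped during conversion shows up as a failure rather than as a silently lost bit.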
Generally beefing up test coverage would be good too, since it seems like we lost some of the basic "try writing and reading some different statuses" test from TestFileStatus.

> In Hadoop 3, make FileStatus serialize itself via protobuf
> ----------------------------------------------------------
>
>                 Key: HDFS-6984
>                 URL: https://issues.apache.org/jira/browse/HDFS-6984
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>   Affects Versions: 3.0.0-alpha1
>            Reporter: Colin P. McCabe
>            Assignee: Colin P. McCabe
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, HDFS-6984.nowritable.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this to serialize it and send it over the wire. But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}} which serves to serialize this information. The protobuf form is preferable, since it allows us to add new fields in a backwards-compatible way. Another issue is that already a lot of subclasses of FileStatus don't override the Writable methods of the superclass, breaking the interface contract that read(status.write) should be equal to the original status.
> In Hadoop 3, we should just make FileStatus serialize itself via protobuf so that we don't have to deal with these issues. It's probably too late to do this in Hadoop 2, since user code may be relying on the existing FileStatus serialization there.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)