hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10983) OIV tool should make an EC file explicit
Date Wed, 08 Feb 2017 00:43:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857122#comment-15857122 ]

Andrew Wang commented on HDFS-10983:

Hi Manoj, thanks for posting the patch and for the thoughtful discussion above. Some review comments:

One high-level comment after looking more closely at this: I don't think we should touch
the delimited output at all. It's effectively a legacy format, and it is already missing many of
the fields added to the fsimage since the fsimage was converted to PB. The output is also fragile,
and I know some users have built apps on top of the delimited output, since
they reported issues when we broke it during the switch of the fsimage to PB. So, IMO we should skip
adding these new fields here too. Interested users can always use the XML output instead.
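
To illustrate the fragility concern, here is a minimal sketch of how a downstream consumer might read the delimited output by column position. The column layout, the sample rows, and the `parse_row` helper are all hypothetical and are not the OIV tool's actual schema; the point is only that a field inserted mid-row silently shifts every later index.

```python
# Hypothetical legacy consumer of a tab-delimited listing.
# The column order below is an assumption for illustration, not the real OIV layout.


def parse_row(line, delimiter="\t"):
    """Parse one delimited row by fixed position, as a legacy script might."""
    fields = line.split(delimiter)
    # Indices are hard-coded against the old layout: path, replication, size, user.
    return {"path": fields[0], "replication": fields[1], "size": fields[2]}


# Row in the old (assumed) layout.
old_row = "\t".join(["/a/file", "3", "1024", "hdfs"])
# Same file after a hypothetical new column is inserted before the size.
new_row = "\t".join(["/a/file", "3", "CONTIGUOUS", "1024", "hdfs"])

old_parsed = parse_row(old_row)
new_parsed = parse_row(new_row)  # "size" now holds the new column's value
```

The parser raises no error on the new row; it simply reports the wrong value, which is the kind of breakage that is hard for users to notice.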

* FSImageLoader, sorry that I wasn't clear on this before, but we shouldn't be adding new
fields to FileStatus that aren't there in the real WebHDFS output. We should expose xattr
information with the {{getXAttrs}} and related APIs, not in {{getFileStatus}}. Note that most
users interact with the web OIV tool via commands like {{hadoop fs -ls webhdfs://....}} not
with curl commands directly. So, if the webhdfs client doesn't understand the field, it won't
show up. Considering this is not really EC related, we could file a different JIRA to add
xattr support to the OIV tool.
* Related to the above, a blockType added to the JSON output won't show up when listing
with the webhdfs client either. Kai is working on adding getECPolicy support in WebHDFS at
HDFS-11394, after which we can also add support in OIV.
* This doesn't matter if we just skip all the delimited changes, but: I don't think a directory
has a meaningful blockType or can be striped since it doesn't have any data. I'd prefer we
stick to printing the xattrs (which is also generally useful). The class javadoc also recommends
printing nothing rather than a "-" for missing values.
* Same as the above if we skip the delimited changes, but: PBImageTextWriter#getEntry, the
new javadoc param is named differently from the actual param
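
The point about client-side visibility can be sketched as follows. This is not the WebHDFS client's code: `KNOWN_FIELDS`, `to_file_status`, and `format_listing` are hypothetical names, and the sample response is made up. It only shows that a client which maps a fixed set of FileStatus fields will drop an extra field such as blockType before anything is printed.

```python
import json

# Fields this hypothetical client maps into its status object.
# The names follow the WebHDFS FileStatus JSON; the subset is an assumption.
KNOWN_FIELDS = ("pathSuffix", "type", "length", "owner",
                "group", "permission", "replication")


def to_file_status(raw):
    """Keep only the fields the client knows about; anything else is dropped."""
    return {name: raw.get(name) for name in KNOWN_FIELDS}


def format_listing(status):
    """Render one ls-style line from the mapped fields only."""
    return "{type} {permission} {replication} {owner} {group} {length} {pathSuffix}".format(**status)


# Made-up server response carrying an extra, non-standard blockType field.
response = json.loads(
    '{"FileStatuses": {"FileStatus": [{"pathSuffix": "ec_file", "type": "FILE",'
    ' "length": 1024, "owner": "hdfs", "group": "supergroup",'
    ' "permission": "644", "replication": 0, "blockType": "STRIPED"}]}}'
)

raw_status = response["FileStatuses"]["FileStatus"][0]
status = to_file_status(raw_status)
listing = format_listing(status)
```

A user running a listing command through such a client would never see the extra field, even though it is present in the raw JSON.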

With the above comments in mind, we aren't left with much in the current patch (basically
just the {{blockType}} name fix).

I think we should complete HDFS-11382 before adding the file EC policy in the XML output since
they're related, but besides that I'm fine with splitting the work among JIRAs however you

> OIV tool should make an EC file explicit
> ----------------------------------------
>                 Key: HDFS-10983
>                 URL: https://issues.apache.org/jira/browse/HDFS-10983
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-10983.01.patch
> The OIV tool's webhdfs interface does not print whether a file is striped or not.
> Also, it prints the file's EC policy ID as the replication factor, which is inconsistent
> with the output of a typical webhdfs call to the cluster, which always shows a replication factor
> of 0 for EC files.
> Not just webhdfs: the delimited output does not print whether a file is striped or not either.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
