hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11082) Erasure Coding : Provide replicated EC policy to just replicating the files
Date Thu, 03 Aug 2017 21:49:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113552#comment-16113552 ]

Andrew Wang commented on HDFS-11082:

Hi Sammi, this looks good overall, thanks for working on this! A few review comments:

* We should add documentation and javadocs describing this new special policy so users and
admins are aware of it
* Also need to think about the behavior of {{getErasureCodingPolicy}}. Right now it returns
"null" to mean replication. With this patch, a user would have to check both for "null" and
"replication-1-2-64K" to know if it's replicated. It'd be good to choose one or the other
to make it simpler for downstreams. "null" would be more compatible, and it'd hide the special
replicated EC policy from non-admin users which I like.
* Please add messages to the asserts in the tests to help with later debugging
* Is this policy enabled by default? If not, I think it should be.
* Would be nice to rename the paths in the test cases to be more descriptive. As an example,
right now we have:

    final Path rootPath = new Path("/striped");
    final Path childPath = new Path(rootPath, "replica");
    final Path subChildPath = new Path(childPath, "replica");
    final Path filePath = new Path(childPath, "file");
    final Path filePath2 = new Path(subChildPath, "file");

Instead, perhaps something more like:

    final Path rootPath = new Path("/striped");
    final Path replicaPath = new Path(rootPath, "replica");
    final Path subReplicaPath = new Path(replicaPath, "subreplica");
    final Path replicaFilePath = new Path(replicaPath, "file");
    final Path subReplicaFilePath = new Path(subReplicaPath, "file");
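
To make the {{getErasureCodingPolicy}} point above concrete, here is a rough, self-contained sketch of the two options. It does not use any HDFS classes: ReplicationCheck, isReplicated and toUserVisiblePolicy are hypothetical names made up for illustration, policies are modeled as plain strings, and "RS-6-3-64k" is only an example name for a striped policy. The only value taken from the discussion is "replication-1-2-64K".

```java
/** Hypothetical helper, not part of HDFS: models policies as plain name strings. */
final class ReplicationCheck {
    // Name of the special replicated policy as mentioned in the review comment.
    static final String REPLICATION_POLICY_NAME = "replication-1-2-64K";

    /**
     * Option A: the API may return either null or the special policy,
     * so every downstream caller has to check for both.
     */
    static boolean isReplicated(String policyName) {
        return policyName == null || REPLICATION_POLICY_NAME.equals(policyName);
    }

    /**
     * Option B: the user-facing call hides the special policy by mapping it
     * to null, so callers keep a single "null means replication" check.
     */
    static String toUserVisiblePolicy(String policyName) {
        return REPLICATION_POLICY_NAME.equals(policyName) ? null : policyName;
    }

    public static void main(String[] args) {
        // "RS-6-3-64k" is an illustrative striped policy name, an assumption.
        String[] samples = {null, REPLICATION_POLICY_NAME, "RS-6-3-64k"};
        for (String name : samples) {
            System.out.println(name + " -> replicated=" + isReplicated(name)
                + ", userVisible=" + toUserVisiblePolicy(name));
        }
    }
}
```

With option B, existing callers that only test for null would keep working unchanged, which is the compatibility argument made above.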

This is not directly related (and I think we discussed this a bit on another JIRA), but I'm
not happy with our getECPolicy API. Right now it returns the effective EC policy.
Without being able to query the actual EC policy, the behavior when setting/unsetting is kind
of tricky. Should we add a "getActualECPolicy" API? Can be a follow-on JIRA.

If you don't mind, one immediate improvement we could make is documenting in the {{getECPolicy}}
javadoc that it returns the effective EC policy.
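
As a rough illustration of the effective vs. actual distinction, here is a minimal in-memory sketch. It is not how the NameNode implements this: EcPolicyResolver and its methods are hypothetical, paths are assumed to be absolute and normalized strings, policies are plain names, and "RS-6-3-64k" is just an example striped policy name.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the namespace: maps a path to the policy set directly on it. */
final class EcPolicyResolver {
    private final Map<String, String> explicitPolicies = new HashMap<>();

    void setPolicy(String path, String policyName) {
        explicitPolicies.put(path, policyName);
    }

    void unsetPolicy(String path) {
        explicitPolicies.remove(path);
    }

    /** "Actual" policy: only what was set directly on this path, or null. */
    String getActualPolicy(String path) {
        return explicitPolicies.get(path);
    }

    /** "Effective" policy: the nearest policy found on the path or its ancestors, or null. */
    String getEffectivePolicy(String path) {
        for (String current = path; current != null; current = parentOf(current)) {
            String policy = explicitPolicies.get(current);
            if (policy != null) {
                return policy;
            }
        }
        return null;
    }

    /** Parent of an absolute, normalized path; null for the root. */
    private static String parentOf(String path) {
        if (path.equals("/")) {
            return null;
        }
        int lastSlash = path.lastIndexOf('/');
        return lastSlash == 0 ? "/" : path.substring(0, lastSlash);
    }

    public static void main(String[] args) {
        EcPolicyResolver resolver = new EcPolicyResolver();
        resolver.setPolicy("/striped", "RS-6-3-64k");
        resolver.setPolicy("/striped/replica", "replication-1-2-64K");

        String file = "/striped/replica/file";
        System.out.println("effective: " + resolver.getEffectivePolicy(file));
        System.out.println("actual:    " + resolver.getActualPolicy(file));

        // After unsetting, the effective policy falls back to the parent's,
        // which a caller cannot tell apart without the "actual" query.
        resolver.unsetPolicy("/striped/replica");
        System.out.println("effective after unset: " + resolver.getEffectivePolicy(file));
    }
}
```

The sketch shows why set/unset is tricky with only the effective view: a non-null result does not say whether the policy was set on the path itself or inherited from a parent.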

> Erasure Coding : Provide replicated EC policy to just replicating the files
> ---------------------------------------------------------------------------
>                 Key: HDFS-11082
>                 URL: https://issues.apache.org/jira/browse/HDFS-11082
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>            Reporter: Rakesh R
>            Assignee: SammiChen
>            Priority: Critical
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11082.001.patch
> The idea of this jira is to provide a new {{replicated EC policy}} so that we can override
> the EC policy on a parent directory and go back to just replicating the files based on replication
> Thanks [~andrew.wang] for the [discussions|https://issues.apache.org/jira/browse/HDFS-11072?focusedCommentId=15620743&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15620743].
