hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11565) Use compact identifiers for built-in ECPolicies in HdfsFileStatus
Date Thu, 06 Apr 2017 23:54:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959995#comment-15959995 ]

Wei-Chiu Chuang commented on HDFS-11565:

[~andrew.wang] thanks for working on it. The patch itself looks reasonable. Let's review it after
HDFS-11623 is checked in.

One issue I saw is:
+    if (policy == null) {
+      return new ErasureCodingPolicy(proto.getName(),
+          convertECSchema(proto.getSchema()),
+          proto.getCellSize(), id);
+    }
This means a new ErasureCodingPolicy object is created each time it is called. Shouldn't it be cached?
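
To illustrate the caching idea, here is a minimal, self-contained sketch. It is not the patch's code: the Policy class, the EcPolicyCacheSketch class, and the convert method are hypothetical stand-ins for ErasureCodingPolicy and the proto conversion, and it assumes the policy ID uniquely identifies a policy, so the ID can serve as the cache key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class EcPolicyCacheSketch {

    /** Hypothetical stand-in for ErasureCodingPolicy (immutable). */
    public static final class Policy {
        public final String name;
        public final int cellSize;
        public final byte id;

        Policy(String name, int cellSize, byte id) {
            this.name = name;
            this.cellSize = cellSize;
            this.id = id;
        }
    }

    // Assumption: the ID uniquely identifies a policy, so it is a safe key.
    private static final Map<Byte, Policy> CACHE = new ConcurrentHashMap<>();

    /**
     * Hypothetical conversion step: returns the cached instance for this ID,
     * building a new Policy only the first time the ID is seen.
     */
    public static Policy convert(String name, int cellSize, byte id) {
        return CACHE.computeIfAbsent(id, k -> new Policy(name, cellSize, k));
    }

    public static void main(String[] args) {
        Policy first = convert("example-policy", 64 * 1024, (byte) 1);
        Policy second = convert("example-policy", 64 * 1024, (byte) 1);
        // Both calls return the same object, so no second allocation happens.
        System.out.println(first == second);
    }
}
```

Whether a cache like this is worthwhile depends on how often the non-built-in path is hit and on how policies are removed or redefined, which this sketch does not handle.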

> Use compact identifiers for built-in ECPolicies in HdfsFileStatus
> -----------------------------------------------------------------
>                 Key: HDFS-11565
>                 URL: https://issues.apache.org/jira/browse/HDFS-11565
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Blocker
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11565.001.patch
> Discussed briefly on HDFS-7337 with Kai Zheng. Quoting our convo:
> {quote}
> From looking at the protos, one other question I had is about the overhead of these protos
> when using the hardcoded policies. There are a bunch of strings and ints, which can be kind
> of heavy since they're added to each HdfsFileStatus. Should we make the built-in ones identified
> by purely an ID, with these fully specified protos used for the pluggable policies?
> {quote}
> {quote}
> Sounds like this could be considered separately because, for both built-in policies and
> plugged-in policies, the full meta info is maintained either in the code or in the persisted
> fsimage, so identifying them purely by an ID should work fine. If you agree, we could refactor
> the code you mentioned above separately.
> {quote}
