hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
Date Thu, 27 Aug 2015 17:42:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717142#comment-14717142
] 

Zhe Zhang commented on HDFS-8833:
---------------------------------

Thanks for the comment, Kai. Great to know that the branch is working reliably.

Although the size of the patch is large, most of it is refactoring (changing CLI, RPC etc.).
There are only a few non-trivial changes as [summarized | https://issues.apache.org/jira/browse/HDFS-8833?focusedCommentId=14700453&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14700453]
above.

I think it's worthwhile to finalize the *semantics* of using EC while it's still in a feature
branch. This change also affects the fsimage format (whether ECPolicy ID is stored per file),
which we should also finalize before merging to trunk.

> Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8833
>                 URL: https://issues.apache.org/jira/browse/HDFS-8833
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8833-HDFS-7285-merge.00.patch, HDFS-8833-HDFS-7285-merge.01.patch, HDFS-8833-HDFS-7285.02.patch
>
>
> We have [discussed | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754]
> storing EC schema with files instead of EC zones and recently revisited the discussion under
> HDFS-8059.
> As a recap, the _zone_ concept has severe limitations including renaming and nested configuration.
> Those limitations are valid in encryption for security reasons and it doesn't make sense to
> carry them over in EC.
> This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For simplicity,
> we should first implement it as an xattr and consider memory optimizations (such as moving
> it to file header) as a follow-on. We should also disable changing EC policy on a non-empty
> file / dir in the first phase.
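
The per-file approach described above can be sketched roughly as follows. This is a toy model under stated assumptions, not code from the patch: `ToyFile`, the xattr name `toy.ec.policy`, the `schema:cellSize` value encoding, and the schema names and cell size used in `main` are all hypothetical and do not correspond to Hadoop's `INodeFile` or its xattr API. It only illustrates keeping the schema and cell size in an xattr on the file itself, and refusing a policy change once the file is non-empty.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: not Hadoop's INodeFile, and not its xattr API.
final class ToyFile {
    // Hypothetical xattr name, chosen for this sketch.
    static final String EC_XATTR = "toy.ec.policy";

    private final Map<String, String> xattrs = new HashMap<>();
    private long length = 0;

    // Simulates writing data so the file becomes non-empty.
    void append(long bytes) {
        length += bytes;
    }

    // Stores schema name and cell size together in one xattr value.
    // Rejects the change once the file holds data (first-phase restriction).
    void setEcPolicy(String schemaName, int cellSizeBytes) {
        if (length > 0) {
            throw new IllegalStateException("EC policy cannot change on a non-empty file");
        }
        xattrs.put(EC_XATTR, schemaName + ":" + cellSizeBytes);
    }

    // Returns null when no EC policy has been set on this file.
    String getEcPolicy() {
        return xattrs.get(EC_XATTR);
    }

    public static void main(String[] args) {
        ToyFile f = new ToyFile();
        f.setEcPolicy("RS-6-3", 64 * 1024); // illustrative values
        System.out.println(f.getEcPolicy());
        f.append(1);
        try {
            f.setEcPolicy("RS-3-2", 64 * 1024);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the policy travels with the file in this model, moving the file to another directory would not affect it, which is the renaming limitation of zones that the description refers to.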



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
