hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11382) Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage
Date Mon, 27 Feb 2017 23:29:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886759#comment-15886759 ]

Andrew Wang commented on HDFS-11382:

Thanks for the rev Manoj, LGTM overall, +1 pending these little nits:

* FSDirWriteFileOp: typo "replictaionFactor" in addFileForEditLog
* Unused import for Preconditions in INodeFileAttributes

[~ehiggs], my concern is that encoding whether a file is erasure coded in both the EC policy
and the BlockTypeProto fields opens us up to possible incongruity between the two fields.
Since I'm not proposing we do away with BlockType entirely, I double checked the Precondition
checks we have in this patch, and it looks okay.
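
To make the incongruity concern concrete, here is a minimal sketch of the kind of check that keeps the two fields in agreement. The names {{EcFieldCheck}} and {{check}} and the local {{BlockType}} enum are hypothetical stand-ins for illustration, not the classes or the Precondition checks in the patch, and a {{null}} policy ID stands in for an unset optional field.

```java
// Hypothetical stand-ins for illustration; these are not the HDFS classes.
enum BlockType { CONTIGUOUS, STRIPED }

final class EcFieldCheck {
    /**
     * Throws if the block type and the optional EC policy ID disagree.
     * ecPolicyId is null when the optional field is unset (an assumption
     * of this sketch, not how the protobuf-generated code exposes it).
     */
    static void check(BlockType blockType, Byte ecPolicyId) {
        boolean striped = blockType == BlockType.STRIPED;
        boolean hasPolicy = ecPolicyId != null;
        // The two fields are congruent only when both say "erasure coded"
        // or both say "not erasure coded".
        if (striped != hasPolicy) {
            throw new IllegalStateException(
                "blockType=" + blockType + " but EC policy ID is "
                    + (hasPolicy ? "set (" + ecPolicyId + ")" : "unset"));
        }
    }
}
```

A striped file with a policy ID and a non-striped file without one both pass; the two mixed combinations are rejected.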

Also as an FYI, HDFS-8030 wants to implement "contiguous EC," so we need a JIRA to rename
CONTIGUOUS to REPLICATED. I filed HDFS-11465 for this; if you want to pick it up, it should be
pretty easy to do this refactoring with IDE assistance.

> Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage
> --------------------------------------------------------------------------------
>                 Key: HDFS-11382
>                 URL: https://issues.apache.org/jira/browse/HDFS-11382
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: hdfs
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11382.01.patch, HDFS-11382.02.patch, HDFS-11382.03.patch, HDFS-11382.04.patch
> For Erasure Coded files, the replication field in the INodeFile message is re-used to store
> the EC Policy ID.
> *FSDirWriteFileOp#addFile*
> {noformat}
>   private static INodesInPath addFile(
>       FSDirectory fsd, INodesInPath existing, byte[] localName,
>       PermissionStatus permissions, short replication, long preferredBlockSize,
>       String clientName, String clientMachine)
>       throws IOException {
>     .. .. ..
>     try {
>       ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp.
>           getErasureCodingPolicy(fsd.getFSNamesystem(), existing);
>       if (ecPolicy != null) {
>         replication = ecPolicy.getId();   <===
>       }
>       final BlockType blockType = ecPolicy != null?
>           BlockType.STRIPED : BlockType.CONTIGUOUS;
>       INodeFile newNode = newINodeFile(fsd.allocateNewInodeId(), permissions,
>           modTime, modTime, replication, preferredBlockSize, blockType);
>       newNode.setLocalName(localName);
>       newNode.toUnderConstruction(clientName, clientMachine);
>       newiip = fsd.addINode(existing, newNode, permissions.getPermission());
> {noformat}
> With the HDFS-11268 fix, {{FSImageFormatPBINode#Loader#loadInodeFile}} correctly gets
> the EC ID from the replication field and then uses the right policy to construct the blocks.
> *FSImageFormatPBINode#Loader#loadInodeFile*
> {noformat}
>       ErasureCodingPolicy ecPolicy = (blockType == BlockType.STRIPED) ?
>           ErasureCodingPolicyManager.getPolicyByPolicyID((byte) replication) :
>           null;
> {noformat}
> The original intention was to re-use the replication field so the in-memory representation
> would be compact. But this isn't necessary for the on-disk representation. replication is
> an optional field, and if we add another optional field for the EC policy, it won't take any
> extra space.
> Also, we need to have the appropriate asserts in place to make sure both
> fields aren't set for the same INodeFile.
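
As a rough illustration of the description above, the sketch below keeps a single compact value in memory while treating replication and the EC policy ID as two separate optional inputs, and it rejects the case where both are set. The class {{PackedRedundancy}}, its method names and the 1-bit flag plus 11-bit value layout are assumptions made for this example, not the actual INodeFile header format, and {{null}} stands in for an unset optional field.

```java
// Hypothetical sketch; the bit layout and all names are assumptions,
// not taken from the HDFS source.
final class PackedRedundancy {
    private static final int STRIPED_FLAG = 1 << 11;         // bit 11 marks a striped file
    private static final int VALUE_MASK = STRIPED_FLAG - 1;  // low 11 bits hold the value

    /** Packs the two optional on-disk fields (null = unset) into one in-memory value. */
    static int pack(Short replication, Byte ecPolicyId) {
        // The assert the description asks for: both fields must not be set together.
        if (replication != null && ecPolicyId != null) {
            throw new IllegalArgumentException(
                "replication and EC policy ID must not both be set");
        }
        if (ecPolicyId != null) {
            // Striped file: store the policy ID and raise the flag bit.
            return STRIPED_FLAG | (ecPolicyId & 0xFF);
        }
        int r = (replication == null) ? 0 : replication;
        if (r < 0 || r > VALUE_MASK) {
            throw new IllegalArgumentException("replication out of range: " + r);
        }
        // Replicated file: store the replication factor, flag bit stays clear.
        return r;
    }

    /** True if the packed value describes a striped (erasure coded) file. */
    static boolean isStriped(int packed) {
        return (packed & STRIPED_FLAG) != 0;
    }

    /** The replication factor or the EC policy ID, depending on isStriped. */
    static int value(int packed) {
        return packed & VALUE_MASK;
    }
}
```

With this shape the on-disk format can carry two distinct optional fields while the in-memory form still fits in the same number of bits as before.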

