hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Manoj Govindassamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-11382) Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage
Date Tue, 28 Feb 2017 01:16:45 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Manoj Govindassamy updated HDFS-11382:
--------------------------------------
    Release Note: The FSImage on-disk format for INodeFile is changed to additionally include
a field for Erasure Coded files. This optional field 'erasureCodingPolicyID' which is unit32
type is available for all Erasure Coded files and represents the Erasure Coding Policy ID.
Previously, the 'replication' field in INodeFile disk format was overloaded  to represent
the same Erasure Coding Policy ID.

> Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11382
>                 URL: https://issues.apache.org/jira/browse/HDFS-11382
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: hdfs
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>             Fix For: 3.0.0-alpha3
>
>         Attachments: HDFS-11382.01.patch, HDFS-11382.02.patch, HDFS-11382.03.patch, HDFS-11382.04.patch,
HDFS-11382.05.patch
>
>
> For Erasure Coded files, replication field in INodeFile message is re-used  to store
the EC Policy ID. 
> *FSDirWriteFileOp#addFile*
> {noformat}
>   private static INodesInPath addFile(
>       FSDirectory fsd, INodesInPath existing, byte[] localName,
>       PermissionStatus permissions, short replication, long preferredBlockSize,
>       String clientName, String clientMachine)
>       throws IOException {
>     .. .. ..
>     try {
>       ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp.
>           getErasureCodingPolicy(fsd.getFSNamesystem(), existing);
>       if (ecPolicy != null) {
>         replication = ecPolicy.getId();   <===
>       }
>       final BlockType blockType = ecPolicy != null?
>           BlockType.STRIPED : BlockType.CONTIGUOUS;
>       INodeFile newNode = newINodeFile(fsd.allocateNewInodeId(), permissions,
>           modTime, modTime, replication, preferredBlockSize, blockType);
>       newNode.setLocalName(localName);
>       newNode.toUnderConstruction(clientName, clientMachine);
>       newiip = fsd.addINode(existing, newNode, permissions.getPermission());
> {noformat}
> With HDFS-11268 fix, {{FSImageFormatPBINode#Loader#loadInodeFile}} is rightly getting
the EC ID from the replication field and then uses the right Policy to construct the blocks.
> *FSImageFormatPBINode#Loader#loadInodeFile*
> {noformat}
>       ErasureCodingPolicy ecPolicy = (blockType == BlockType.STRIPED) ?
>           ErasureCodingPolicyManager.getPolicyByPolicyID((byte) replication) :
>           null;
> {noformat}
> The original intention was to re-use the replication field so the in-memory representation
would be compact. But, this isn't necessary for the on-disk representation. replication is
an optional field, and if we add another optional field for the EC policy, it won't be any
extra space.
> Also, we need to make sure to have the appropriate asserts in place to make sure both
fields aren't set for the same INodeField.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message