hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8728) Erasure coding: revisit and simplify BlockInfoStriped and INodeFile
Date Thu, 16 Jul 2015 21:01:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630321#comment-14630321

Zhe Zhang commented on HDFS-8728:

Thanks Andrew again for the review! It's really helpful.

I'll start from the high level comments:
bq. doesn't ECSchema belong in the INode rather than each BlockInfo, considering each BlockInfo
in a file will have the same schema?
We've [discussed | https://issues.apache.org/jira/browse/HDFS-8059?focusedCommentId=14482080&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14482080]
the idea under HDFS-8059. It's an interesting design question how to efficiently store all
EC-related information. For example, {{ECSchema}} logically belongs to a file, but the BM
module needs to access the info much more frequently than NN. There's a tradeoff between time,
memory space, and configuration flexibility; maybe we should keep it open until observing
more usage patterns.

bq. Having the isStriped ternaries and if statements everywhere is ugly, can we push some
of this logic down into the block itself?
Good point. Basically we can generalize the *replication* concept to *redundancy*. The patch
for HDFS-8697 includes the following denotations. We should also think about how to quantify
the redundancy concept. 
 * The following terms are used to denote different
 * levels of a block's storage redundancy:
 *   1. Fully-redundant / full redundancy: A block is stored with the right
 *      number of replicas or unique internal blocks as specified.
 *   2. Partially-redundant / partial redundancy: A block has fewer replicas /
 *      unique internal blocks than specified.
 *   3. Not-redundant / zero redundancy: A contiguous block is stored with one 
 *      replica or a striped block is stored with dataBlockNum unique internal
 *      blocks. Losing any replica / internal block will corrupt the block.
 *   4. Overly-redundant / over redundancy: A block has more replicas / internal
 *      blocks than specified full-redundancy.
Based on that generalization, we can unify contiguous / striped logics in many places:
# {{BlockManager#getMinStorageNum()}} should be pushed to something like {{BlockInfo#getMinStorages()}}
# {{BlockManager#minReplication}} can be renamed to {{minRedundancy}} and the default should
be 0 instead of 1. Similar for {{FSNamesystem#safeReplication}}. Then we can compare the actual
number of storages with {{BlockInfo#getMinStorages() + minRedundancy}}

bq. Basically, wherever I see a use of getStripedBlockStorageOp, I question whether it could
be pushed down into a BlockInfo subclass. Ideally the Op wouldn't need a getter.
This is related to the {{getOp()}} idea discussed above. The challenge is still around {{BlockInfo}}
multi-inheritance. When we call {{getStripedBlockStorageOp}} we expect the instance to be
either {{BIStriped}} or {{BIUCStriped}}.

bq. There are lots of places which will be broken by a change to *EC+repl*
Agreed. Striping+Repl will bring in another dimension of complexity. I think at this stage
we should focus on invalidating all secondary replicas of an internal block, and calculating
redundancy purely based on the number of unique internal blocks. In the next rev I will document
this both in {{BIStriped}} and in relevant places in {{BlockManager}}.

bq. persisting the EC schema to the INode
Agreed. Currently we are using hard-coded default schema in some places and configured ones
elsewhere. [~kaisasak] has been working on removed hard-coded schema and cellSize in all places.
Maybe we should move HDFS-8494 and HDFS-8062 back to HDFS-7285?

> Erasure coding: revisit and simplify BlockInfoStriped and INodeFile
> -------------------------------------------------------------------
>                 Key: HDFS-8728
>                 URL: https://issues.apache.org/jira/browse/HDFS-8728
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8728-HDFS-7285.00.patch, HDFS-8728-HDFS-7285.01.patch, HDFS-8728-HDFS-7285.02.patch,
HDFS-8728-HDFS-7285.03.patch, HDFS-8728.00.patch, HDFS-8728.01.patch, HDFS-8728.02.patch,
Merge-1-codec.patch, Merge-2-ecZones.patch, Merge-3-blockInfo.patch, Merge-4-blockmanagement.patch,
Merge-5-blockPlacementPolicies.patch, Merge-6-locatedStripedBlock.patch, Merge-7-replicationMonitor.patch,

This message was sent by Atlassian JIRA

View raw message