hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8728) Erasure coding: revisit and simplify BlockInfoStriped and INodeFile
Date Thu, 16 Jul 2015 00:43:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629006#comment-14629006

Andrew Wang commented on HDFS-8728:

Hi Zhe, some more comments, this time while reading through patch #4. Again not all comments
are directly related to the patch.

BlockManager nits:
* Nit: "For a striped block to invalidate" -> "When invalidating a striped block".
* Nit: needs a line break: "   * @return Returns the value of whether there are any non-EC
blocks using StripedID."

BlockManager higher-level:
* At a high-level, doesn't ECSchema belong in the INode rather than each BlockInfo, considering
each BlockInfo in a file will have the same schema? I did a quick check and it seems like
we often have access to the BlockCollection when we need schema information.
* Having the isStriped ternaries and if statements everywhere is ugly, can we push some of
this logic down into the block itself? It'd be better to see for example a straightforward
{{minStorages = blockInfo.getMinStorages()}}. Basically, wherever I see a use of getStripedBlockStorageOp,
I question whether it could be pushed down into a BlockInfo subclass. Ideally the Op wouldn't
need a getter.
* There are lots of places which will be broken by a change to EC+repl, since the assumption
is that {{# internal blocks == # replicas}}. Generally speaking, I don't like how internal
blocks are sometimes treated as blocks and sometimes as replicas, since it leads to special-casey
{{if (isStriped())}} logic everywhere. At a minimum, this really needs to be documented somewhere
so users of BlockInfo know what to do.
* I would also like everything that requires changing NN metadata finished before merging
to trunk, which right now includes at least persisting the EC schema to the INode.

BlockManager detail:
* In this snippet, why do we use getDataBlockNum vs. getRealDataBlockNum?

    final int minStorage = curBlock.isStriped() ?
        curBlock.getStripedBlockStorageOp().getDataBlockNum() : minReplication;

* In getStoredBlock, the logic could be deduped by inverting the if condition as follows:

    if (hasNonEcBlockUsingStripedID) {
        BlockInfo info = blocksMap.getStoredBlock(block);
        if (info != null) {
        return info;
    return blocksMap.getStoredBlock(
        new Block(BlockIdManager.convertToStripedID(block.getBlockId())));

* I see places using HdfsConstants.NUM_DATA_BLOCKS and NUM_PARITY_BLOCKS, i.e. FSDirWriteFileOp#validateAddBlock.
These should be updated to use the file's schema right? This relates to HDFS-7859 since right
now we're using a default schema everywhere.
* We also need to be sure to dedupe the ECSchemas in memory when we implement this, we don't
want one per INode or per BlockInfo.
* chooseExcessReplicasStriped, the javadoc needs a grammar pass. I'd be nice to think about
sharing logic with the Contiguous function too; might be more possible in an EC+repl world.
* chooseExcessReplicasStriped and chooseExcessReplicasContiguous should possibly be named
*Replicates* rather than *Replicas* based on the parent function, or we rename the parent.
I favor the latter, since usage of "replica" is more common than "replicate" based on my knowledge
of the code base.

BlockManager EC+repl issues:
* The safeReplication minimum is configurable, so we should respect it in incrementSafeBlockCount.
This is essentially an EC+repl issue; each internal block needs to be at least safeReplication
* The total block count (i.e. {{blocksMap.size()}} isn't going to reflect each internal block
in a BlockInfoStriped, which is not accurate in an EC+repl world. It counts BlockInfos rather
than Blocks.
* In PendingReplicationBlocks and UnderReplicatedBlocks we probably have the same issue with
EC+repl. InvalidateBlocks denormalizes the BlockInfoStriped into its component internal Blocks,
so might be okay. Possibly other places I haven't checked yet too.
* countNodes probably will need an update for EC+repl too.
* checkReplication, also essentially a EC+repl issue. We broke the assumption that all the
BlockInfos in a BlockCollection have the same replication factor. Would be good to document
this fact somewhere in BlockInfoStriped.

* Regarding the below snippet, it's attempting to increase priority if we have less than 33%
extra parity, but this falls apart for schemes like (3,2) since having only one parity still
leaves us at 50%. Should probably include a special case for "only one extra block left",
or do some rounding. Need to fix the comment too.
    } else if ((curReplicas - dataBlkNum) * 3 < parityBlkNum + 1) {
      // can only afford one replica loss

> Erasure coding: revisit and simplify BlockInfoStriped and INodeFile
> -------------------------------------------------------------------
>                 Key: HDFS-8728
>                 URL: https://issues.apache.org/jira/browse/HDFS-8728
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8728-HDFS-7285.00.patch, HDFS-8728-HDFS-7285.01.patch, HDFS-8728-HDFS-7285.02.patch,
HDFS-8728-HDFS-7285.03.patch, HDFS-8728.00.patch, HDFS-8728.01.patch, HDFS-8728.02.patch,
Merge-1-codec.patch, Merge-2-ecZones.patch, Merge-3-blockInfo.patch, Merge-4-blockmanagement.patch,
Merge-5-blockPlacementPolicies.patch, Merge-6-locatedStripedBlock.patch, Merge-7-replicationMonitor.patch,

This message was sent by Atlassian JIRA

View raw message