hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7369) Erasure coding: distribute recovery work for striped blocks to DataNode
Date Thu, 12 Mar 2015 22:50:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359570#comment-14359570 ]

Zhe Zhang commented on HDFS-7369:
---------------------------------

Thanks Jing and Kai for the helpful reviews!

bq. For a striped block, I think it will be better to use BlockInfo(Striped), instead of its
individual blocks, as the basic unit for recovery. E.g., suppose we lose 2 blocks for a 6+3
EC block. For recovery, I guess we want these two blocks are recovered in a single recovery
work instead of 2.
Agreed. A striped block group should be recovered as a whole; multiple missing blocks should
be recovered on a single target first. In the new patch, an array of {{targets}} is selected,
and {{targets[0]}} is the DN that the recovery command is sent to.
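
A rough sketch of that grouping, to make the idea concrete. The types here ({{GroupRecoverySketch}}, {{RecoveryWork}}, the string DN names) are hypothetical stand-ins, not the classes in the patch; the only points illustrated are one work item per block group and the first target receiving the command.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: stand-in types, not HDFS classes. */
final class GroupRecoverySketch {

    /** One unit of recovery work covering all missing blocks of one block group. */
    static final class RecoveryWork {
        final long groupId;
        final List<Integer> missingIndices;
        final String commandTarget; // plays the role of targets[0]

        RecoveryWork(long groupId, List<Integer> missingIndices, String commandTarget) {
            this.groupId = groupId;
            this.missingIndices = missingIndices;
            this.commandTarget = commandTarget;
        }
    }

    /**
     * missing: pairs of {groupId, indexInGroup}, one per lost internal block.
     * targetsByGroup: the target DataNodes chosen for each group (assumed input).
     */
    static List<RecoveryWork> plan(long[][] missing, Map<Long, List<String>> targetsByGroup) {
        // Collect the missing indices of each group, preserving arrival order.
        Map<Long, List<Integer>> byGroup = new LinkedHashMap<>();
        for (long[] m : missing) {
            byGroup.computeIfAbsent(m[0], k -> new ArrayList<>()).add((int) m[1]);
        }
        // Emit a single work item per group, addressed to the first target.
        List<RecoveryWork> work = new ArrayList<>();
        for (Map.Entry<Long, List<Integer>> e : byGroup.entrySet()) {
            List<String> targets = targetsByGroup.get(e.getKey());
            if (targets == null || targets.isEmpty()) {
                continue; // no target chosen yet; leave the group for a later pass
            }
            work.add(new RecoveryWork(e.getKey(), e.getValue(), targets.get(0)));
        }
        return work;
    }
}
```

With two lost blocks in the same 6+3 group, {{plan}} returns one work item listing both indices rather than two separate items.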

bq. As you mentioned in HDFS-7912, BlockManager and ReplicationMonitor never see individual
data/parity blocks currently. But it may be better to have a more strict type restriction
in UnderReplicatedBlocks, ReplicationWork, ErasureCodingWork, and computeRecoveryWorkForBlocks's
parameter.
That's a good point. I don't think there's any additional overhead of doing that. I'll rebase
after HDFS-7912.

bq. What is the case we're targeting for, is it the block recovering in stripping ec case
? If so, we need to make the title clearer, since we also have other cases for erased block
recovering in pure ec form.
Yes, this patch is for recovering striped blocks. I've updated the subject.

bq. is it possible to explicitly assemble all the necessary information in a BlockGroup and
pass around to construct a ErasureCodingRecoveryWork
I think the current {{BlockCodecCommand}} has all that a DN needs for recovery, except for
the schema, which is still hard-coded.

bq. I don't see {{missingBlockIdx}} is actually used.
Good catch, updated in the new patch.

One thing I'm not yet sure about is how to efficiently get the indices of missing blocks.
Maybe something like the below?
{code}
for (DatanodeStorageInfo storage : blocksMap.getStorages(block, State.FAILED)) {
  indices.add(block.getStorageBlockIndex(storage));
}
{code}
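
Another possibility, sketched below under assumptions: rather than iterating over failed storages, take the complement of the indices that healthy storages still report. This is a different technique from the loop above, and {{MissingIndexSketch}} and its parameters are hypothetical, not part of {{BlocksMap}} or any existing HDFS API.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Illustrative only: models block indices as plain ints, not HDFS types. */
final class MissingIndexSketch {

    /**
     * liveIndices: the block index reported by each healthy storage of the group.
     * totalBlocks: data + parity blocks in the group (9 for a 6+3 layout).
     * Returns the indices in [0, totalBlocks) that no healthy storage reports.
     */
    static List<Integer> missingIndices(int[] liveIndices, int totalBlocks) {
        BitSet present = new BitSet(totalBlocks);
        for (int idx : liveIndices) {
            if (idx >= 0 && idx < totalBlocks) {
                present.set(idx); // duplicates and out-of-range values are ignored
            }
        }
        List<Integer> missing = new ArrayList<>();
        // Walk the clear bits, i.e. the indices with no live storage.
        for (int i = present.nextClearBit(0); i < totalBlocks; i = present.nextClearBit(i + 1)) {
            missing.add(i);
        }
        return missing;
    }
}
```

The cost is linear in the group size, which is small and fixed for a given schema, but it relies on every healthy storage reporting its index correctly.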

> Erasure coding: distribute recovery work for striped blocks to DataNode
> -----------------------------------------------------------------------
>
>                 Key: HDFS-7369
>                 URL: https://issues.apache.org/jira/browse/HDFS-7369
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7369-000-part1.patch, HDFS-7369-000-part2.patch, HDFS-7369-001.patch, HDFS-7369-002.patch
>
>
> This JIRA updates NameNode to handle background / offline recovery of erasure coded blocks. It includes 2 parts:
> # Extend {{UnderReplicatedBlocks}} to recognize EC blocks and insert them to appropriate priority levels.
> # Update {{ReplicationMonitor}} to distinguish block codec tasks and send a new DataNode command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
