hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-8005) Erasure Coding: simplify striped block recovery work computation and add tests
Date Fri, 27 Mar 2015 22:36:53 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-8005:
    Attachment: HDFS-8005.000.patch

Uploaded the patch. It fixes issues #1 and #3 from the description and adds a new test to make
sure the recovery work computation/distribution is correct. The test also provides a utility
function that can create a file with striped blocks by using synthetic block reports. We can use
this function for testing NN side logic before the client side work is done.

Besides, the patch also makes the following simplifications:
# Instead of recording missing block indices, we can directly capture the live/healthy block
indices. This simplifies both the NN side computation and, later, the DN side interpretation.
# Instead of providing precisely {{NUM_DATA_BLOCKS}} source nodes, I think we can simply provide
all the existing live nodes as sources. This avoids the random selection logic and does not
bring in too much redundant information (at most k-1 source nodes). It also gives the DN some
flexibility for later recovery, e.g., to support different EC schemas (LRC) or to allow hedged
reads of source data.
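
To illustrate the two simplifications above, here is a minimal, self-contained sketch. It is not the code in the patch: the class {{StripedRecoverySketch}}, the {{Replica}} and {{RecoveryWork}} types, and both method names are hypothetical, and the 6 data + 3 parity layout used in {{main}} is an assumption chosen only for the example. The NN side keeps every live internal block as a source together with its index, and the DN side derives the missing indices from the live ones.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical, not HDFS APIs.
class StripedRecoverySketch {

    /** One reported internal block of a striped block group. */
    static final class Replica {
        final String datanode;
        final int blockIndex;
        final boolean live;

        Replica(String datanode, int blockIndex, boolean live) {
            this.datanode = datanode;
            this.blockIndex = blockIndex;
            this.live = live;
        }
    }

    /** Recovery work handed to a datanode: all live sources and their indices. */
    static final class RecoveryWork {
        final List<String> sources;
        final List<Integer> liveIndices;

        RecoveryWork(List<String> sources, List<Integer> liveIndices) {
            this.sources = sources;
            this.liveIndices = liveIndices;
        }
    }

    /** NN side: record every live block instead of selecting exactly k sources. */
    static RecoveryWork computeRecoveryWork(List<Replica> replicas) {
        List<String> sources = new ArrayList<>();
        List<Integer> liveIndices = new ArrayList<>();
        for (Replica r : replicas) {
            // Only add a node after checking that its block is live.
            if (r.live) {
                sources.add(r.datanode);
                liveIndices.add(r.blockIndex);
            }
        }
        return new RecoveryWork(sources, liveIndices);
    }

    /** DN side: the missing indices are whatever is not in the live set. */
    static List<Integer> missingIndices(List<Integer> liveIndices, int totalBlocks) {
        BitSet live = new BitSet(totalBlocks);
        for (int index : liveIndices) {
            live.set(index);
        }
        List<Integer> missing = new ArrayList<>();
        for (int i = live.nextClearBit(0); i < totalBlocks; i = live.nextClearBit(i + 1)) {
            missing.add(i);
        }
        return missing;
    }

    public static void main(String[] args) {
        int totalBlocks = 9; // assumed layout: 6 data + 3 parity blocks
        List<Replica> replicas = new ArrayList<>();
        for (int i = 0; i < totalBlocks; i++) {
            if (i == 7) {
                continue; // index 7 was never reported
            }
            replicas.add(new Replica("dn" + i, i, i != 2)); // index 2 is not live
        }
        RecoveryWork work = computeRecoveryWork(replicas);
        System.out.println("sources = " + work.sources);
        System.out.println("live    = " + work.liveIndices);
        System.out.println("missing = " + missingIndices(work.liveIndices, totalBlocks));
    }
}
```

Because the live indices travel with the source list, the receiving side can choose which sources to read from, which is where the flexibility mentioned in #2 would come from.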

To apply the patch we may need to apply HDFS-7907 first.

> Erasure Coding: simplify striped block recovery work computation and add tests
> ------------------------------------------------------------------------------
>                 Key: HDFS-8005
>                 URL: https://issues.apache.org/jira/browse/HDFS-8005
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-8005.000.patch
> HDFS-7369 adds the functionality to distribute recovery work of striped blocks to datanodes.
> There are still some pending issues:
> # In {{BlockManager#chooseSourceNode}}, a node is added into {{healthyIndices}} without
> checking if its block is live and healthy
> # The test {{TestRecoverStripedBlocks#testMissingStripedBlock}} has not tested striped
> blocks because the file is created before setting the storage policy
> # In {{computeRecoveryWorkForBlocks}}, instead of using {{BlockCollection#isStriped}},
> we'd better use {{BlockInfo#isStriped}}

