hadoop-hdfs-dev mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-9837) BlockManager#countNodes should be able to detect duplicated internal blocks
Date Fri, 19 Feb 2016 22:45:18 GMT
Jing Zhao created HDFS-9837:
-------------------------------

             Summary: BlockManager#countNodes should be able to detect duplicated internal
blocks
                 Key: HDFS-9837
                 URL: https://issues.apache.org/jira/browse/HDFS-9837
             Project: Hadoop HDFS
          Issue Type: Sub-task
    Affects Versions: 3.0.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao


Currently {{BlockManager#countNodes}} only counts the number of replicas/internal blocks, so
it cannot detect the under-replicated scenario in which a striped EC block has 9 internal blocks
but some of them are duplicated data/parity blocks. E.g., b8 is missing while two copies of b0 exist:
b0, b1, b2, b3, b4, b5, b6, b7, b0
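
To illustrate the gap, here is a minimal sketch (not the actual {{BlockManager}} code) that contrasts a plain count with a count of distinct internal block indices. The class {{StripedReplicaCounter}} and both method names are hypothetical, and representing reported internal blocks as an int array of indices is an assumption made only for this example.

```java
import java.util.BitSet;

// Hypothetical helper for illustration only; not part of HDFS.
final class StripedReplicaCounter {

  /** Plain count: every reported internal block is treated as live. */
  static int naiveCount(int[] reportedIndices) {
    return reportedIndices.length;
  }

  /**
   * Duplicate-aware count: each internal block index is counted at most
   * once, so a second copy of b0 does not mask a missing b8.
   */
  static int distinctCount(int[] reportedIndices, int totalInternalBlocks) {
    BitSet seen = new BitSet(totalInternalBlocks);
    for (int idx : reportedIndices) {
      // Ignore indices outside the block group.
      if (idx >= 0 && idx < totalInternalBlocks) {
        seen.set(idx);
      }
    }
    return seen.cardinality();
  }
}
```

For the example above (b0 through b7 plus a second b0), {{naiveCount}} returns 9 while {{distinctCount}} with 9 total internal blocks returns 8, which exposes the missing b8.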

If the NameNode keeps running, it detects the duplication of b0 and puts the
block into the excess map. {{countNodes}} excludes internal blocks captured in the excess
map and thus returns the correct number of live replicas. However, if the NN restarts before
sending out the reconstruction command, the missing internal block can no longer be detected.
The following steps reproduce the issue:
# create an EC file
# kill DN1 and wait for the reconstruction to happen
# start DN1 again
# kill DN2 and restart NN immediately
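
The steps above can be walked through with a toy model. This is only a sketch under stated assumptions: {{ToyBlockState}} and its methods are hypothetical and not HDFS APIs, the assignment of internal blocks to DataNodes is invented for the example, and the model simply assumes, per the description above, that the excess information is lost on an NN restart and the duplicate is not detected again.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model for illustration only; names are hypothetical, not HDFS APIs.
final class ToyBlockState {
  // DataNode name -> index of the internal block it stores.
  private final Map<String, Integer> storages = new LinkedHashMap<>();
  // DataNodes whose internal block was recognized as a duplicate.
  private final Set<String> excess = new HashSet<>();

  /** A DataNode reports an internal block; duplicates go to the excess set. */
  void report(String dn, int index) {
    boolean duplicate = false;
    for (Map.Entry<String, Integer> e : storages.entrySet()) {
      if (e.getValue() == index && !excess.contains(e.getKey())) {
        duplicate = true;
      }
    }
    storages.put(dn, index);
    if (duplicate) {
      excess.add(dn);
    }
  }

  /** A DataNode dies: its internal block is no longer available. */
  void kill(String dn) {
    storages.remove(dn);
    excess.remove(dn);
  }

  /** Assumption: the restart drops the excess set without rebuilding it. */
  void restartNameNode() {
    excess.clear();
  }

  /** Count of internal blocks, excluding those in the excess set. */
  int countLive() {
    int live = 0;
    for (String dn : storages.keySet()) {
      if (!excess.contains(dn)) {
        live++;
      }
    }
    return live;
  }
}
```

In this model, suppose dn0 through dn8 hold b0 through b8. Killing dn0, reporting a reconstructed b0 on a new node, and bringing dn0 back puts dn0 into the excess set. Killing the node holding b8 then gives a live count of 8, so the under-replication is visible. Calling {{restartNameNode}} at that point raises the count to 9 even though b8 is still missing.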



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
