hadoop-common-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5133) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
Date Tue, 10 Feb 2009 18:46:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672357#action_12672357 ]

Hairong Kuang commented on HADOOP-5133:
---------------------------------------

Next two lines of the log:
WARN  hdfs.StateChange (FSNamesystem.java:addStoredBlock(2872)) - BLOCK* NameSystem.addStoredBlock: Redundant addStoredBlock request received for blk_2248817250507458558_1011 on 127.0.0.1:51024 size 63
WARN  hdfs.StateChange (FSNamesystem.java:addStoredBlock(2872)) - BLOCK* NameSystem.addStoredBlock: Redundant addStoredBlock request received for blk_2248817250507458558_1011 on 127.0.0.1:51021 size 63

The blockReceived from 127.0.0.1:51021 did arrive. This time the NN did not complain about
the length; it reported a redundant addStoredBlock instead. The replica was added to the
blocksMap, but to no use, because the block had already been marked as corrupt.

What went wrong here is that 127.0.0.1:51021 had a perfectly good replica, but the NN wrongly
marked it as corrupt based on stale information.

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-5133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5133
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.2
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.18.4
>
>
> Currently the NameNode treats either the new replica or the existing replicas as corrupt
> if the new replica's length is inconsistent with the NN-recorded block length. The correct
> behavior should be:
> 1. For a block that is not under construction, the new replica should be marked as corrupt
> if its length is inconsistent (whether shorter or longer) with the NN-recorded block length.
> 2. For an under-construction block, if the new replica's length is shorter than the
> NN-recorded block length, the new replica could be marked as corrupt; if the new replica's
> length is longer, the NN should update its recorded block length. But it should not mark
> existing replicas as corrupt. This is because the NN-recorded length for an
> under-construction block does not accurately match the block length on the datanode's disk.
> The NN should not judge an under-construction replica to be corrupt based on inaccurate
> information, namely its own recorded block length.
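
The two rules in the description above can be sketched as a small decision function. This
is only an illustration of the proposed behavior, not the FSNamesystem code or the eventual
patch: the class ReplicaLengthCheck, the enum Action, and the method decide are all
hypothetical names, and the sketch assumes the NN knows whether the block is under
construction and has both lengths at hand.

```java
/** Hypothetical sketch of the proposed length check; not actual Hadoop code. */
public class ReplicaLengthCheck {

    /** Illustrative outcomes for a newly reported replica. */
    public enum Action {
        ACCEPT,                    // lengths agree, nothing to do
        MARK_NEW_REPLICA_CORRUPT,  // only the new replica is suspect
        UPDATE_RECORDED_LENGTH     // NN's recorded length was stale
    }

    /**
     * Decides what to do with a newly reported replica.
     * Existing replicas are never marked corrupt by this check.
     */
    public static Action decide(boolean underConstruction,
                                long recordedLength,
                                long reportedLength) {
        if (reportedLength == recordedLength) {
            return Action.ACCEPT;
        }
        if (!underConstruction) {
            // Rule 1: any mismatch, shorter or longer, makes the new replica corrupt.
            return Action.MARK_NEW_REPLICA_CORRUPT;
        }
        if (reportedLength < recordedLength) {
            // Rule 2, shorter case: the description says the replica "could be"
            // marked corrupt; this sketch assumes it always is.
            return Action.MARK_NEW_REPLICA_CORRUPT;
        }
        // Rule 2, longer case: trust the datanode and refresh the NN's record.
        return Action.UPDATE_RECORDED_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, 63, 63)); // finalized block, same length
        System.out.println(decide(false, 63, 70)); // finalized block, longer replica
        System.out.println(decide(true, 63, 40));  // under construction, shorter
        System.out.println(decide(true, 63, 70));  // under construction, longer
    }
}
```

The point of returning an action instead of mutating state is that no branch ever touches
the existing replicas, which is the behavior the description asks for.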

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

