hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Brian Bockelman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4692) Namenode in infinite loop for replicating/deleting corrupted block
Date Tue, 25 Nov 2008 04:18:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650454#action_12650454
] 

Brian Bockelman commented on HADOOP-4692:
-----------------------------------------

This is a duplicate of HADOOP-3314, but this really writes up the problem better... can we
close the older ticket?

We've run into this issue locally, and it's rather debilitating because it can result in "silent
corruptions": these truncations can accumulate for a long time without anything noticing.
 If you are running with 2 replicas (hey, not all of us can afford all that raw disk space...)
and lose a data node, then this can result in a nasty surprise if the second copy had this
truncation problem.

This in fact has caused corruption for about 500 files locally.

>  Namenode in infinite loop for replicating/deleting corrupted block
> -------------------------------------------------------------------
>
>                 Key: HADOOP-4692
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4692
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>             Fix For: 0.20.0
>
>
> Our cluster has an under-replicated block with only one replica, assuming its block id
is B. NameNode log shows that NameNode is in an infinite loop replicating/deleting the block.
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* ask DN1 to replicate blk_B to datanode(s)
DN2, DN3
> WARN org.apache.hadoop.fs.FSNamesystem: Inconsistent size for block blk_B reported from
DN2  current size is 134217728 reported size is 134205440
> WARN org.apache.hadoop.fs.FSNamesystem: Deleting block blk_B from DN2
> INFO org.apache.hadoop.dfs.StateChange: DIR* NameSystem.invalidateBlock: blk_B on DN2
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_B is added to invalidSet
of DN2
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated:
DN2 is added to blk_B size 134217728
> WARN org.apache.hadoop.fs.FSNamesystem: Inconsistent size for block blk_-B reported from
DN3 current size is 134217728 reported size is 134205440
> WARN org.apache.hadoop.fs.FSNamesystem: Deleting block blk_B from DN3
> INFO org.apache.hadoop.dfs.StateChange: DIR* NameSystem.invalidateBlock: blk_B on DN3
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_B is added to invalidSet
of DN3
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated:
DN3 is added to blk_B size 134217728
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* ask DN1 to replicate blk_B  to datanode(s)
DN4, DN5
> ...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message