hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4692) Namenode in infinite loop for replicating/deleting corrupted block
Date Tue, 13 Jan 2009 01:12:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663172#action_12663172 ]

Raghu Angadi commented on HADOOP-4692:
--------------------------------------

> BlockSender needs to know if the block reading is for block transfer or not by checking
> the client name before throwing TruncateBlockException. Would this be OK?

I don't think so. Right now it always throws IOException. We just need to change the exception
type so that higher levels can distinguish this case.
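
A minimal sketch of that idea, assuming the new exception is a subclass of IOException. The
name TruncateBlockException comes from the quoted question; the class BlockSenderSketch, its
methods, and the returned strings are hypothetical stand-ins, not the actual DataNode code.

```java
import java.io.IOException;

// Assumed to extend IOException so that existing catch blocks keep working.
class TruncateBlockException extends IOException {
    TruncateBlockException(String msg) {
        super(msg);
    }
}

// Hypothetical stand-in for the sender; not the real BlockSender.
class BlockSenderSketch {
    // Throws the specific exception when fewer bytes are on disk than requested.
    static void checkLength(long onDiskLen, long requestedLen) throws IOException {
        if (onDiskLen < requestedLen) {
            throw new TruncateBlockException("on-disk length " + onDiskLen
                    + " is shorter than requested length " + requestedLen);
        }
    }

    // Hypothetical higher-level caller that tells truncation apart from other I/O errors.
    static String transfer(long onDiskLen, long requestedLen) {
        try {
            checkLength(onDiskLen, requestedLen);
            return "SENT";
        } catch (TruncateBlockException e) {
            return "TRUNCATED"; // the caller can react to this case specifically
        } catch (IOException e) {
            return "IO_ERROR";  // any other I/O failure
        }
    }
}
```

With this shape the sender does not need to look at the client name; the decision about
how to react stays with the caller.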

> Another question is what should BlockSender do if the on-disk block length is longer
> than the NN recorded length? Currently block replication only copies the number of bytes
> recorded by NN. Is this a good idea?

Copying only the bytes requested by NN is ok (as far as NN is concerned). As with the previous
comment, I don't think BlockSender should worry about it; some higher level in DataNode should...
I am +0 on fixing the "extra data" issue. But if we want to, the DataTransfer thread could check
for the right size before even creating a BlockSender.
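
A rough sketch of such a pre-check, under the assumption that the transfer code knows both
the on-disk length and the length recorded by the NN before it builds the sender. The class
DataTransferSketch, the enum, and the method name are hypothetical and only illustrate the
comparison; what to do in the "extra data" case is left to the caller.

```java
// Hypothetical helper for a DataTransfer-style thread; not the real DataNode code.
class DataTransferSketch {
    enum LengthCheck { MATCH, TRUNCATED, EXTRA_DATA }

    // Compares the replica length on disk with the length recorded by the NN.
    static LengthCheck compare(long onDiskLen, long nnRecordedLen) {
        if (onDiskLen < nnRecordedLen) {
            return LengthCheck.TRUNCATED;  // fewer bytes on disk than the NN expects
        }
        if (onDiskLen > nnRecordedLen) {
            return LengthCheck.EXTRA_DATA; // more bytes on disk than the NN expects
        }
        return LengthCheck.MATCH;
    }
}
```

Using the two sizes from the log in the issue description, compare(134205440L, 134217728L)
yields TRUNCATED, so the thread could stop before creating a sender at all.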

>  Namenode in infinite loop for replicating/deleting corrupted block
> -------------------------------------------------------------------
>
>                 Key: HADOOP-4692
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4692
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.20.0
>
>         Attachments: namenode_inconsistent_size.patch, truncateBlockReplication.patch
>
>
> Our cluster has an under-replicated block with only one replica; assume its block id
> is B. The NameNode log shows that the NameNode is in an infinite loop replicating/deleting the block.
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* ask DN1 to replicate blk_B to datanode(s) DN2, DN3
> WARN org.apache.hadoop.fs.FSNamesystem: Inconsistent size for block blk_B reported from DN2 current size is 134217728 reported size is 134205440
> WARN org.apache.hadoop.fs.FSNamesystem: Deleting block blk_B from DN2
> INFO org.apache.hadoop.dfs.StateChange: DIR* NameSystem.invalidateBlock: blk_B on DN2
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_B is added to invalidSet of DN2
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: DN2 is added to blk_B size 134217728
> WARN org.apache.hadoop.fs.FSNamesystem: Inconsistent size for block blk_B reported from DN3 current size is 134217728 reported size is 134205440
> WARN org.apache.hadoop.fs.FSNamesystem: Deleting block blk_B from DN3
> INFO org.apache.hadoop.dfs.StateChange: DIR* NameSystem.invalidateBlock: blk_B on DN3
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_B is added to invalidSet of DN3
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: DN3 is added to blk_B size 134217728
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* ask DN1 to replicate blk_B to datanode(s) DN4, DN5
> ...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

