hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3493) Replication is not happened for the block (which is recovered and in finalized) to the Datanode which has got the same block with old generation timestamp in RBW
Date Tue, 05 Jun 2012 18:12:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289608#comment-13289608
] 

Konstantin Shvachko commented on HDFS-3493:
-------------------------------------------

I agree with Uma this works as designed. On a three-node cluster one can manually delete the
corrupt replica, which should trigger replication.
                
> Replication is not happened for the block (which is recovered and in finalized) to the
Datanode which has got the same block with old generation timestamp in RBW
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3493
>                 URL: https://issues.apache.org/jira/browse/HDFS-3493
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.1-alpha
>            Reporter: J.Andreina
>
> replication factor= 3, block report interval= 1min and start NN and 3DN
> Step 1:Write a file without close and do hflush (Dn1,DN2,DN3 has blk_ts1)
> Step 2:Stopped DN3
> Step 3:recovery happens and time stamp updated(blk_ts2)
> Step 4:close the file
> Step 5:blk_ts2 is finalized and available in DN1 and Dn2
> Step 6:now restarted DN3(which has got blk_ts1 in rbw)
> From the NN side there is no cmd issued to DN3 to delete the blk_ts1 . But ask DN3 to
make the block as corrupt .
> Replication of blk_ts2 to DN3 is not happened.
> NN logs:
> ========
> {noformat}
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate
requested for blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by /XX.XX.XX.XX
because reported RWR replica with genstamp 1007 does not match COMPLETE block's genstamp in
block map 1008
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from DatanodeRegistration(XX.XX.XX.XX,
storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, ipcPort=50277, storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0),
blocks: 2, processing time: 1 msecs
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_3927215081484173742_1008
from neededReplications as it has enough replicas.
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate
requested for blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by /XX.XX.XX.XX
because reported RWR replica with genstamp 1007 does not match COMPLETE block's genstamp in
block map 1008
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from DatanodeRegistration(XX.XX.XX.XX,
storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, ipcPort=50277, storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0),
blocks: 2, processing time: 1 msecs
> WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to
place enough replicas, still in need of 1 to reach 1
> For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> {noformat}
> fsck Report
> ===========
> {noformat}
> /file21:  Under replicated BP-1008469586-XX.XX.XX.XX-1336829603103:blk_3927215081484173742_1008.
Target Replicas is 3 but found 2 replica(s).
> .Status: HEALTHY
>  Total size:	495 B
>  Total dirs:	1
>  Total files:	3
>  Total blocks (validated):	3 (avg. block size 165 B)
>  Minimally replicated blocks:	3 (100.0 %)
>  Over-replicated blocks:	0 (0.0 %)
>  Under-replicated blocks:	1 (33.333332 %)
>  Mis-replicated blocks:		0 (0.0 %)
>  Default replication factor:	1
>  Average block replication:	2.0
>  Corrupt blocks:		0
>  Missing replicas:		1 (14.285714 %)
>  Number of data-nodes:		3
>  Number of racks:		1
> FSCK ended at Sun May 13 09:49:05 IST 2012 in 9 milliseconds
> The filesystem under path '/' is HEALTHY
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message