hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3119) Overreplicated block is not deleted even after the replication factor is reduced after sync follwed by closing that file
Date Tue, 27 Mar 2012 18:02:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239737#comment-13239737 ]

Uma Maheswara Rao G commented on HDFS-3119:

The actual problem is that we reduce the replication factor from 2 to 1 and then close the file.

If the complete call succeeds with the minimum replication of 1, and the other DN's
addStoredBlock request arrives only after that, then that request can process the
over-replicated block, because the file will already have moved from under construction to
finalized.

The other case: if both addStoredBlock requests have already been processed before the
complete call moves the file from under construction to finalized, then nothing is left to
process the over-replicated block.

I feel the solution is to add an over-replication check to the
BlockManager#checkReplication method, which is called when the file is completed.

The current code checks only for needed (missing) replicas:
 public void checkReplication(Block block, int numExpectedReplicas) {
    // filter out containingNodes that are marked for decommission.
    NumberReplicas number = countNodes(block);
    if (isNeededReplication(block, numExpectedReplicas, number.liveReplicas())) { 
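
The two orderings described above, and the effect of the proposed check, can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: ToyBlock, its boolean flags, and the withProposedCheck switch are hypothetical names invented for illustration and are not part of BlockManager. The real code paths carry far more state.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the race described above; NOT the real BlockManager. */
public class OverReplicationSketch {

    /** Hypothetical stand-in for the NameNode's view of one block. */
    static final class ToyBlock {
        final Set<String> liveReplicas = new HashSet<>();
        boolean underConstruction = true;
        boolean queuedForReplication = false;   // models the neededReplications queue
        boolean queuedForExcessRemoval = false; // models over-replication processing
    }

    /** Models a DataNode reporting a replica; excess is handled only for finalized files. */
    static void addStoredBlock(ToyBlock b, String dataNode, int expected) {
        b.liveReplicas.add(dataNode);
        if (!b.underConstruction && b.liveReplicas.size() > expected) {
            b.queuedForExcessRemoval = true;
        }
    }

    /** Models the check run on completion; the else-branch is the proposed addition. */
    static void checkReplication(ToyBlock b, int expected, boolean withProposedCheck) {
        int live = b.liveReplicas.size();
        if (live < expected) {
            b.queuedForReplication = true;
        } else if (withProposedCheck && live > expected) {
            b.queuedForExcessRemoval = true;
        }
    }

    /** Models completing the file: finalize it, then check replication. */
    static void complete(ToyBlock b, int expected, boolean withProposedCheck) {
        b.underConstruction = false;
        checkReplication(b, expected, withProposedCheck);
    }

    /** Case 1: the second replica is reported after complete (current check only). */
    public static boolean excessDetectedLateReport() {
        ToyBlock b = new ToyBlock();
        addStoredBlock(b, "dn1", 1);
        complete(b, 1, false);
        addStoredBlock(b, "dn2", 1);
        return b.queuedForExcessRemoval;
    }

    /** Case 2: both replicas are reported before complete, replication factor already 1. */
    public static boolean excessDetected(boolean withProposedCheck) {
        ToyBlock b = new ToyBlock();
        addStoredBlock(b, "dn1", 1);
        addStoredBlock(b, "dn2", 1);
        complete(b, 1, withProposedCheck);
        return b.queuedForExcessRemoval;
    }

    public static void main(String[] args) {
        System.out.println("late report, current check:    " + excessDetectedLateReport());
        System.out.println("early reports, current check:  " + excessDetected(false));
        System.out.println("early reports, proposed check: " + excessDetected(true));
    }
}
```

In this model only the second case with the current check leaves the extra replica unnoticed. Adding the over-replication branch to the completion-time check closes that gap.
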
> Overreplicated block is not deleted even after the replication factor is reduced after sync follwed by closing that file
> ------------------------------------------------------------------------------------------------------------------------
>                 Key: HDFS-3119
>                 URL: https://issues.apache.org/jira/browse/HDFS-3119
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.24.0
>            Reporter: J.Andreina
>            Priority: Minor
>             Fix For: 0.24.0, 0.23.2
> cluster setup:
> --------------
> 1 NN, 2 DNs, replication factor 2, block report interval 3 sec, block size 256 MB
> step1: write a file "filewrite.txt" of size 90 bytes with sync (not closed)
> step2: change the replication factor to 1 using the command: "./hdfs dfs -setrep 1 /filewrite.txt"
> step3: close the file
> * On the NN side, the log message "Decreasing replication from 2 to 1 for /filewrite.txt" appears, but the over-replicated blocks are not deleted even after the block report is sent from the DN
> * When listing the file in the console using "./hdfs dfs -ls", the replication factor for that file is shown as 1
> * The fsck report for that file shows that the file is replicated to 2 datanodes


