hadoop-common-dev mailing list archives

From "Koji Noguchi (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-2065) Replication policy for corrupted block
Date Tue, 16 Oct 2007 19:09:51 GMT
Replication policy for corrupted block 

                 Key: HADOOP-2065
                 URL: https://issues.apache.org/jira/browse/HADOOP-2065
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.14.1
            Reporter: Koji Noguchi

Thanks to HADOOP-1955, even if one of the replicas is corrupted, the block should get replicated
from a good replica relatively quickly.

Created this ticket to continue the discussion from http://issues.apache.org/jira/browse/HADOOP-1955#action_12531162.

bq. 2. Delete corrupted source replica
bq. 3. If all replicas are corrupt, stop replication.

For (2), it would be nice if the namenode could delete the corrupted replica when there is a good replica
on another node.
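
A rough sketch of that decision, to make the proposal concrete. This is not actual namenode code; the class, record, and method names here are made up for illustration:

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of case (2): invalidate a corrupted replica
 *  only when at least one good replica survives on another node. */
class CorruptReplicaPolicy {

    /** A replica location and whether checksum verification flagged it corrupt. */
    record Replica(String datanode, boolean corrupt) {}

    /** Returns the replicas the namenode could safely invalidate. */
    static List<Replica> replicasToInvalidate(List<Replica> replicas) {
        boolean hasGoodCopy = replicas.stream().anyMatch(r -> !r.corrupt());
        List<Replica> toDelete = new ArrayList<>();
        if (hasGoodCopy) {
            // Safe: a good source exists, so corrupted copies can go.
            for (Replica r : replicas) {
                if (r.corrupt()) toDelete.add(r);
            }
        }
        // If every replica is corrupt, delete nothing -- see (3) below:
        // users may still want to read the data and decide themselves.
        return toDelete;
    }
}
{code}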

For (3), I would prefer that the namenode still replicate the block.
Before 0.14, if a file was corrupted, users were still able to pull the data and decide
whether to delete it (HADOOP-2063).
In 0.14 and later, these blocks are not replicated, so they eventually get lost.

To make matters worse, if the corrupted file is accessed, all of the corrupted replicas except one
are deleted, and the block stays at a replication factor of 1 forever.
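
One way to express the preferred behavior for (3), again as a hypothetical sketch rather than the actual replication code:

{code:java}
import java.util.List;

/** Hypothetical sketch of case (3): if every replica of a block is
 *  corrupt, still schedule replication from a corrupt copy instead
 *  of giving up, so the data stays readable as it did before 0.14. */
class AllCorruptPolicy {

    record Replica(String datanode, boolean corrupt) {}

    /** Pick a source replica for re-replication, or null if none exist. */
    static Replica chooseSource(List<Replica> replicas) {
        // Prefer a good replica as the copy source.
        for (Replica r : replicas) {
            if (!r.corrupt()) return r;
        }
        // All replicas are corrupt: replicate from a corrupt copy
        // anyway, rather than letting the block decay and be lost.
        return replicas.isEmpty() ? null : replicas.get(0);
    }
}
{code}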


This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
