hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2065) Replication policy for corrupted block
Date Sat, 10 May 2008 02:21:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595777#action_12595777 ]

Konstantin Shvachko commented on HADOOP-2065:

This looks correct.
My concern is that we should probably move this into a separate data structure now, so that
we do not have to refactor it later, as we did with the other block collections. So,
- I'd highly recommend creating a separate class, say CorruptedReplicaMap, CorruptedReplicas,
or CorruptedBlocks.
The important thing is that all members and methods related to corrupted replicas that currently
live in FSNamesystem, like invalidateCorruptReplicas() and markBlockAsCorrupt(), should belong to
this class, along with add(), get(), remove(), and isCorrupt().
This is similar to PendingReplicationBlocks, BlocksMap and UnderReplicatedBlocks.
- The methods of the new class should clearly distinguish between Block and BlockInfo parameters.
I presume most parameters will be BlockInfo, because this collection holds only those blocks
that belong to the BlocksMap.
- In processPendingReplications(), the check
  if (num.liveReplicas == 0) continue;
is not necessary, because neededReplications.add() does that inside.
  This code should just be removed.
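
To make the first suggestion concrete, here is a rough sketch of what such a collection might look like. It is only an illustration, not the patch and not existing Hadoop code: the class name CorruptedReplicas is one of the names floated above, and plain long block ids and String datanode names are assumed stand-ins for BlockInfo and the datanode type, which a real version would take as parameters.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical sketch of a dedicated collection for corrupt replicas.
 * Block ids (long) and datanode names (String) are assumptions used to
 * keep the example self-contained; they stand in for BlockInfo and the
 * datanode type used by the namenode.
 */
class CorruptedReplicas {
    // block id -> names of the datanodes that hold a corrupt replica of it
    private final Map<Long, Set<String>> corruptReplicas = new HashMap<>();

    /** Marks the replica of blockId on datanode as corrupt; true if newly recorded. */
    boolean add(long blockId, String datanode) {
        return corruptReplicas
                .computeIfAbsent(blockId, k -> new TreeSet<>())
                .add(datanode);
    }

    /** Returns a read-only view of the datanodes holding corrupt replicas of blockId. */
    Set<String> get(long blockId) {
        Set<String> nodes = corruptReplicas.get(blockId);
        if (nodes == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(nodes);
    }

    /** Forgets every corrupt replica of blockId; true if anything was recorded. */
    boolean remove(long blockId) {
        return corruptReplicas.remove(blockId) != null;
    }

    /** True if the replica of blockId on datanode has been marked corrupt. */
    boolean isCorrupt(long blockId, String datanode) {
        Set<String> nodes = corruptReplicas.get(blockId);
        return nodes != null && nodes.contains(datanode);
    }

    /** Number of replicas of blockId currently marked corrupt. */
    int numCorruptReplicas(long blockId) {
        return get(blockId).size();
    }
}
```

With something like this in place, FSNamesystem would only call into the collection (for example add() from markBlockAsCorrupt() and remove() once the corrupt replicas have been invalidated) instead of keeping the map and its bookkeeping inline.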

> Replication policy for corrupted block 
> ---------------------------------------
>                 Key: HADOOP-2065
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2065
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.14.1
>            Reporter: Koji Noguchi
>            Assignee: lohit vijayarenu
>             Fix For: 0.18.0
>         Attachments: HADOOP-2065-2.patch, HADOOP-2065-3.patch, HADOOP-2065-4.patch, HADOOP-2065-5.patch, HADOOP-2065-6.patch, HADOOP-2065.patch
>
> Thanks to HADOOP-1955, even if one of the replicas is corrupted, the block should get replicated from a good replica relatively fast.
> Created this ticket to continue the discussion from http://issues.apache.org/jira/browse/HADOOP-1955#action_12531162.
> bq. 2. Delete corrupted source replica
> bq. 3. If all replicas are corrupt, stop replication.
> For (2), it would be nice if the namenode could delete the corrupted block when there is a good replica on other nodes.
> For (3), I would prefer that the namenode still replicate the block.
> Before 0.14, if a file was corrupted, users were still able to pull the data and decide whether they wanted to delete those files. (HADOOP-2063)
> In 0.14 and later, we cannot/don't replicate these blocks, so they eventually get lost.
> To make matters worse, if the corrupted file is accessed, all the corrupted replicas except one are deleted, and the block stays at a replication factor of 1 forever.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
