hadoop-hdfs-issues mailing list archives

From "Jitendra Nath Pandey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-15) All replicas of a block end up on only 1 rack
Date Thu, 13 Aug 2009 18:19:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742916#action_12742916
] 

Jitendra Nath Pandey commented on HDFS-15:
------------------------------------------

The proposal is as follows. The first 3 points are the same as the approach suggested in the first
comment, except for a slight change: blocks that are not sufficiently replicated take higher
priority over blocks that have the required replicas but violate the rack requirement.
  1. Both under-replicated blocks and blocks that do not satisfy the rack requirement should be
included in the neededReplication queue.
  2. The neededReplication queue should have 4 priorities:
      Priority 0: Blocks that have only one replica.
      Priority 1: Blocks whose number of replicas is no greater than 1/3 of their replication
factor.
      Priority 2: All other blocks that do not have the required number of replicas.
      Priority 3: Blocks that have the required number of replicas or more, but all of them on
the same rack.
  3. In the methods addStoredBlock, removeStoredBlock, startDecommission, and markBlockAsCorrupt
in FSNamesystem, put both under-replicated blocks and single-rack blocks into the neededReplication
queue. The replicator will, in addition, schedule one more replica for blocks that are on only
one rack but are not under-replicated.
  4. If a block is in priority 3 of the neededReplication queue and ReplicationTargetChooser is
unable to find a target for a replica that meets the rack requirement, do not schedule a
replication; instead, keep the block in the same queue.
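
The 4-priority scheme above can be sketched as a small priority function. This is an illustrative sketch only: the class and method names (ReplicationPriorityExample, getPriority) and the exact signature are assumptions for demonstration, not the actual FSNamesystem code.

```java
// Hypothetical sketch of the 4-level priority scheme proposed above.
// Not the real HDFS API; names and signature are illustrative assumptions.
public class ReplicationPriorityExample {

    static final int PRIORITY_HIGHEST = 0;   // only one replica left
    static final int PRIORITY_VERY_LOW = 1;  // replicas <= 1/3 of replication factor
    static final int PRIORITY_LOW = 2;       // otherwise under-replicated
    static final int PRIORITY_BAD_RACK = 3;  // enough replicas, but all on one rack

    /**
     * @param curReplicas      current number of live replicas
     * @param expectedReplicas the block's replication factor
     * @param numRacks         number of distinct racks holding a replica
     * @return queue priority, or -1 if the block needs no replication work
     */
    static int getPriority(int curReplicas, int expectedReplicas, int numRacks) {
        if (curReplicas >= expectedReplicas) {
            // Fully replicated: only queue it if all replicas share one rack
            // (and more than one replica is expected).
            return (numRacks == 1 && expectedReplicas > 1) ? PRIORITY_BAD_RACK : -1;
        }
        if (curReplicas == 1) {
            return PRIORITY_HIGHEST;
        }
        if (curReplicas * 3 <= expectedReplicas) {
            return PRIORITY_VERY_LOW;
        }
        return PRIORITY_LOW;
    }

    public static void main(String[] args) {
        System.out.println(getPriority(1, 3, 1)); // lone replica -> priority 0
        System.out.println(getPriority(3, 3, 1)); // all on one rack -> priority 3
        System.out.println(getPriority(2, 3, 2)); // simple under-replication -> priority 2
    }
}
```

Note that per point 4, a priority-3 block for which no rack-satisfying target exists would stay in the queue rather than being scheduled.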

> All replicas of a block end up on only 1 rack
> ---------------------------------------------
>
>                 Key: HDFS-15
>                 URL: https://issues.apache.org/jira/browse/HDFS-15
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Hairong Kuang
>            Assignee: Jitendra Nath Pandey
>            Priority: Critical
>
> The HDFS replica placement strategy guarantees that the replicas of a block exist on at
least two racks when its replication factor is greater than one. But fsck still reports that
the replicas of some blocks end up on one rack.
> The cause of the problem is that decommission and corruption handling only check the
block's replication factor, not the rack requirement. When an over-replicated block loses
a replica due to decommission, corruption, or a lost heartbeat, the namenode does not take any
action to guarantee that the remaining replicas are on different racks.
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

