hadoop-hdfs-issues mailing list archives

From "Jitendra Nath Pandey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-15) All replicas of a block end up on only 1 rack
Date Thu, 13 Aug 2009 18:19:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742916#action_12742916 ]

Jitendra Nath Pandey commented on HDFS-15:

The proposal is as follows. The first 3 points are the same as the approach suggested in the
first comment, except for a slight change: blocks that are not sufficiently replicated take
higher priority over blocks that have the required replicas but violate the rack requirement.
  1. Both under-replicated blocks and blocks that do not satisfy the rack requirement should
be included in the neededReplication queue.
  2. The neededReplication queue should have 4 priorities:
      Priority 0: Blocks that have only one replica.
      Priority 1: Blocks whose number of replicas is no greater than 1/3 of their replication factor.
      Priority 2: All other blocks that do not have the required number of replicas.
      Priority 3: Blocks that have the required number of replicas or more, but all of them on
the same rack.
  3. In the methods addStoredBlock, removeStoredBlock, startDecommission, and markBlockAsCorrupt
in FSNamesystem, put both under-replicated blocks and one-rack blocks into the neededReplication
queue. The replicator will, in addition, create one more replica for blocks that are on only
one rack but are not under-replicated.
  4. If a block is in priority 3 of the neededReplication queue and ReplicationTargetChooser is
unable to find a location for a replica that meets the rack requirement, do not schedule a
replication; instead, keep the block in the same queue.
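The four priority levels above can be sketched as a simple selection function. This is only an illustrative sketch, not the actual FSNamesystem code; the class and method names (ReplicationPriority, getPriority) and the parameter layout are assumptions made for the example.

```java
// Illustrative sketch of the four-level neededReplication priority
// scheme described in the proposal. Not the real FSNamesystem API.
public class ReplicationPriority {

    public static final int PRIORITY_ONE_REPLICA = 0;           // priority 0
    public static final int PRIORITY_VERY_UNDER_REPLICATED = 1; // priority 1
    public static final int PRIORITY_UNDER_REPLICATED = 2;      // priority 2
    public static final int PRIORITY_ONE_RACK = 3;              // priority 3
    public static final int NOT_NEEDED = -1;                    // not queued

    /**
     * Decide which neededReplication priority a block falls into.
     *
     * @param liveReplicas     current number of live replicas
     * @param expectedReplicas configured replication factor
     * @param rackCount        number of distinct racks holding a replica
     */
    public static int getPriority(int liveReplicas,
                                  int expectedReplicas,
                                  int rackCount) {
        if (liveReplicas < expectedReplicas) {
            if (liveReplicas == 1) {
                // Only one replica left: highest urgency.
                return PRIORITY_ONE_REPLICA;
            } else if (liveReplicas * 3 <= expectedReplicas) {
                // No more than 1/3 of the replication factor.
                return PRIORITY_VERY_UNDER_REPLICATED;
            } else {
                return PRIORITY_UNDER_REPLICATED;
            }
        } else if (expectedReplicas > 1 && rackCount == 1) {
            // Enough replicas, but all on one rack.
            return PRIORITY_ONE_RACK;
        }
        return NOT_NEEDED; // sufficiently replicated across racks
    }
}
```

Note that a block only reaches priority 3 once its replica count is satisfied, which matches the change described above: under-replication is always handled before the rack-placement violation.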

> All replicas of a block end up on only 1 rack
> ---------------------------------------------
>                 Key: HDFS-15
>                 URL: https://issues.apache.org/jira/browse/HDFS-15
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Hairong Kuang
>            Assignee: Jitendra Nath Pandey
>            Priority: Critical
> The HDFS replica placement strategy guarantees that the replicas of a block exist on at
least two racks when its replication factor is greater than one. But fsck still reports that
the replicas of some blocks end up on one rack.
> The cause of the problem is that decommission and corruption handling only check the
block's replication factor, not the rack requirement. When an over-replicated block loses a
replica due to decommission, corruption, or heartbeat loss, the namenode does not take any
action to guarantee that the remaining replicas are on different racks.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
