hadoop-hdfs-issues mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-15) All replicas of a block end up on only 1 rack
Date Thu, 20 Aug 2009 22:33:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745703#action_12745703 ]

Raghu Angadi commented on HDFS-15:
----------------------------------

> If we go with only one list, the add, remove, update, and getPriority methods in
> UnderReplicatedBlocks.java will need another argument to pass numberOfRacks, or a
> boolean to denote insufficient racks. These methods will also change to take the rack
> policy into account. I agree, in that case we should rename this class.

Not required, according to the definition above: "Priority 3: All other under-replicated blocks."
That is, you don't need the extra args. Note that the under-replicated blocks interface already
takes "expectedReplicas" as an argument (required for deciding priority).

"Special treatment" is required regardless of which list the block is in, because we did not
handle this condition before and we need to now.


> All replicas of a block end up on only 1 rack
> ---------------------------------------------
>
>                 Key: HDFS-15
>                 URL: https://issues.apache.org/jira/browse/HDFS-15
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Hairong Kuang
>            Assignee: Jitendra Nath Pandey
>            Priority: Critical
>         Attachments: HDFS-15.patch
>
>
> The HDFS replica placement strategy guarantees that the replicas of a block exist on at
> least two racks when its replication factor is greater than one. But fsck still reports that
> the replicas of some blocks end up on one rack.
> The cause of the problem is that decommission and corruption handling check only the
> block's replication factor, not the rack requirement. When an over-replicated block loses
> a replica due to decommission, corruption, or a lost heartbeat, the namenode does not take
> any action to guarantee that the remaining replicas are on different racks.
>  
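
The rack requirement in the issue description above can be sketched as a small check.
This is only an illustration, not the code in HDFS-15.patch: the class name
RackPolicyCheck, both method names, and the rack strings are hypothetical, and the sketch
assumes the cluster has more than one rack.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HDFS. Racks are represented as plain strings.
class RackPolicyCheck {

    /**
     * Returns true when the replicas meet the requirement stated in the issue:
     * a block whose replication factor is greater than one must have replicas
     * on at least two racks. Assumes the cluster has more than one rack.
     */
    static boolean satisfiesRackRequirement(List<String> replicaRacks, int replicationFactor) {
        if (replicationFactor <= 1) {
            return true; // a single-replica block cannot span racks
        }
        Set<String> distinctRacks = new HashSet<>(replicaRacks);
        return distinctRacks.size() >= 2;
    }

    /**
     * A block needs attention if it has too few replicas OR if its replicas
     * all sit on one rack. Checking only the first condition is the gap the
     * issue describes.
     */
    static boolean needsAttention(List<String> replicaRacks, int replicationFactor) {
        boolean underReplicated = replicaRacks.size() < replicationFactor;
        return underReplicated || !satisfiesRackRequirement(replicaRacks, replicationFactor);
    }

    public static void main(String[] args) {
        // Three replicas, replication factor 3, but all on one rack.
        List<String> oneRack = Arrays.asList("/rack1", "/rack1", "/rack1");
        System.out.println(needsAttention(oneRack, 3));
    }
}
```

In this sketch a block with enough replicas can still be flagged, which is the condition
that a check on the replication factor alone would miss.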

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

