hadoop-hdfs-issues mailing list archives

From "Brian Bockelman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1094) Intelligent block placement policy to decrease probability of block loss
Date Mon, 12 Apr 2010 23:34:53 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856223#action_12856223 ]

Brian Bockelman commented on HDFS-1094:
---------------------------------------

A funny related anecdote that I've heard third-hand.  I could never verify its authenticity,
but I found it amusing:

A large physics experiment once tried to track down and classify as many errors in their simulation
software as possible.  After they removed the known sources of errors, they took the remaining
unreproducible errors and mapped them to the worker nodes.  Then, they took the list of worker
nodes and mapped them to their locations in the machine room.  Sure enough, all the unreproducible
errors could be traced to the top two nodes in the rack.

So, if you put all the copies at the same height on the rack, the probability of losing the
files stored at the top of the rack is definitely higher than the probability of losing those
stored at the bottom.

> Intelligent block placement policy to decrease probability of block loss
> ------------------------------------------------------------------------
>
>                 Key: HDFS-1094
>                 URL: https://issues.apache.org/jira/browse/HDFS-1094
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The current HDFS implementation specifies that the first replica is local and the other
> two replicas are on any two random nodes on a random remote rack. This means that if any three
> datanodes die together, then there is a non-trivial probability of losing at least one block
> in the cluster. This JIRA is to discuss whether there is a better algorithm that can lower the
> probability of losing a block.
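
The claim in the issue description can be explored with a rough Monte Carlo sketch. This is not HDFS code: the names place_block and loss_probability are hypothetical, and the model assumes equal-sized racks, a uniformly random writer node per block, and exactly three simultaneous datanode failures chosen uniformly at random. Under those assumptions a block is lost only when the failed set equals its replica set.

```python
import random


def place_block(writer, racks, rng):
    """Model of the placement described above (an assumption, not HDFS code):
    replica 1 on the writer node, replicas 2 and 3 on two distinct random
    nodes of one random remote rack. Nodes are (rack_index, node_index)."""
    local_rack = writer[0]
    remote_rack = rng.choice([r for r in range(len(racks)) if r != local_rack])
    second, third = rng.sample(racks[remote_rack], 2)
    return frozenset((writer, second, third))


def loss_probability(num_racks, nodes_per_rack, num_blocks, trials, seed=0):
    """Estimate the probability that three simultaneous random node failures
    destroy every replica of at least one block."""
    rng = random.Random(seed)
    racks = [[(r, n) for n in range(nodes_per_rack)] for r in range(num_racks)]
    all_nodes = [node for rack in racks for node in rack]

    # Distinct replica triples in use; many blocks may share one triple.
    placements = {
        place_block(rng.choice(all_nodes), racks, rng) for _ in range(num_blocks)
    }

    lost = 0
    for _ in range(trials):
        failed = frozenset(rng.sample(all_nodes, 3))
        # With exactly three failures, a block is lost only if the failed
        # set is exactly its replica set.
        if failed in placements:
            lost += 1
    return lost / trials


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only for illustration.
    for blocks in (1_000, 10_000, 100_000):
        p = loss_probability(num_racks=10, nodes_per_rack=20,
                             num_blocks=blocks, trials=20_000)
        print(f"{blocks:>7} blocks: estimated loss probability {p:.4f}")
```

In this model the estimate depends on how many distinct replica triples the blocks occupy, which is why a policy that confines replicas to fewer distinct node groups could, in principle, lower the chance that a given three-node failure hits a complete replica set.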

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
