hadoop-hdfs-issues mailing list archives

From "Brian Bockelman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1094) Intelligent block placement policy to decrease probability of block loss
Date Mon, 12 Apr 2010 19:47:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856139#action_12856139 ]

Brian Bockelman commented on HDFS-1094:
---------------------------------------

Hey Karthik,

Let me play dumb (it might not be playing after all) and try to work out the math a bit.

First, let's assume that on any given day, a node has 1/1000 chance of failing.

CURRENT SCHEME: A block is on 3 random nodes.  Loss requires the simultaneous failure of
nodes X, Y, and Z.  Let's assume these failures are independent.  P(X and Y and Z) = P(X) P(Y) P(Z)
= 1 in a billion.

PROPOSED SCHEME:  Well, the probability is the same.

So, given a specific block, we don't change the probability it is lost.
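
Just to keep myself honest, here is that arithmetic in a couple of lines of Python; the
p = 1/1000 daily failure rate is only the assumption from above:

    # Per-block loss probability, assuming independent node failures at p = 1/1000 per day.
    p = 1.0 / 1000
    p_block_loss = p ** 3     # all three replica holders fail on the same day
    print(p_block_loss)       # 1e-09, i.e. one in a billion, under either scheme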

What you seem to be calculating is the probability that three nodes go down out of N nodes:

P(nodes X, Y, and Z fail for any three distinct X, Y, Z) = 1 - P(N-3 nodes stay up) = 1 -
[999/1000]^[N-3]

Sure enough, if you use a small subset (N=40, say), then the probability of 3 nodes failing
is smaller than it is for the whole cluster.
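
To make that concrete, here is that expression evaluated for a small node subset and for a
large cluster (N=40 and N=2000 are just illustrative sizes I picked, not numbers from this
thread):

    # Evaluate the quantity above, 1 - (999/1000)**(N-3), for two cluster sizes.
    p = 1.0 / 1000
    for n in (40, 2000):
        print(n, 1 - (1 - p) ** (n - 3))
    # N=40   -> ~0.036
    # N=2000 -> ~0.86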

However, that's not the number you want!  You want the probability that *any* block is lost
when three nodes go down.  That is, P(nodes X, Y, and Z fail, for some three distinct X, Y,
Z that hold at least one block in common) (call this P_1).  Assuming that block overlap,
node death, and the choice of 3-node subset are all independent, you get:

P_1 = P(three nodes having at least one common block) * P(3 node death) * (# of distinct 3-node
subsets)

The first number is decreasing with N, the second is constant with N, and the third is increasing
with N.  The third is a well-known formula (it is just N choose 3), while I don't have a good
formula for the first value.  Unless you can calculate or estimate the first factor, I don't
think you can really say anything about decreasing the value of P_1.
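
For what it's worth, the second and third factors are easy to write down; it's only the first
one that's missing.  A tiny sketch (N=2000 is a made-up cluster size, and math.comb needs
Python 3.8+):

    from math import comb

    p = 1.0 / 1000
    N = 2000
    p_three_deaths = p ** 3     # second factor: constant in N (1e-09 here)
    n_triples = comb(N, 3)      # third factor: N choose 3, ~1.33e9 for N = 2000
    # P_1 = P_common * p_three_deaths * n_triples, where P_common -- the probability
    # that three given nodes hold at least one block in common -- is the unknown
    # first factor, and it depends on the placement policy.
    print(p_three_deaths, n_triples)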

I *think* we are incorrectly treating the probability of data loss as proportional to the
probability of 3 machines in a subset being lost, without taking into account the probability
of those machines holding common blocks.  The probabilities get tricky, hence my asking for
someone to sketch it out mathematically...
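
One way to start sketching it out: estimate the first factor by simulation.  The snippet below
is only a rough illustration -- N, B, and the purely random placement are all made-up inputs,
and a real comparison would swap in the proposed group-restricted policy as a second case:

    import random

    # Given one random placement of B blocks (3 replicas each) on N nodes, estimate
    # the fraction of 3-node subsets that hold every replica of at least one block.
    random.seed(0)
    N = 100                       # illustrative node count
    B = 10_000                    # illustrative block count
    nodes = range(N)
    replica_sets = {frozenset(random.sample(nodes, 3)) for _ in range(B)}

    trials = 200_000
    hits = sum(frozenset(random.sample(nodes, 3)) in replica_sets
               for _ in range(trials))
    print(hits / trials)          # Monte Carlo estimate of the first factor

Per the decomposition above, multiplying that estimate by p**3 and by the number of distinct
3-node subsets gives P_1 for that particular placement; repeating it under the proposed
placement would let the two schemes be compared directly.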



> Intelligent block placement policy to decrease probability of block loss
> ------------------------------------------------------------------------
>
>                 Key: HDFS-1094
>                 URL: https://issues.apache.org/jira/browse/HDFS-1094
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The current HDFS implementation specifies that the first replica is local and the other
two replicas are on any two random nodes on a random remote rack. This means that if any three
datanodes die together, then there is a non-trivial probability of losing at least one block
in the cluster. This JIRA is to discuss if there is a better algorithm that can lower probability
of losing a block.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
