hadoop-hdfs-issues mailing list archives

From "Karthik Ranganathan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1094) Intelligent block placement policy to decrease probability of block loss
Date Tue, 13 Apr 2010 01:03:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856257#action_12856257 ]

Karthik Ranganathan commented on HDFS-1094:
-------------------------------------------

I think the probability should be calculated slightly differently if the block placement
policy enforces that certain blocks reside on a subset of machines. In any case, I went with
the probability of losing data as opposed to the expected number of lost blocks.

Scheme 1 - pick any r machines at random and put a block's replicas there. Further, assume
that f = r in your example (the number of failed nodes equals the replication factor). Let N
be the number of datanodes and B the number of blocks in the cluster.

P(of losing data given r failures) 
= P(of losing at least 1 block) 
= 1 - P(of not losing any block)
= 1 - (P(of not losing a specific block) ^ B)
= 1 - ((1 - 1/C(N,r)) ^ B)
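
As a sanity check on the arithmetic, here is a minimal Python sketch of the Scheme 1
formula (the function and variable names are mine; it assumes, as above, that block losses
are independent and that a block is lost only when the r failed nodes are exactly its r
replica holders):

    import math

    def p_loss_scheme1(n_nodes, repl, n_blocks):
        # A specific block is lost iff the r failed nodes are exactly its
        # r replica holders: probability 1 / C(N, r).
        p_block = 1.0 / math.comb(n_nodes, repl)
        # Assuming independence across the B blocks: 1 - (1 - p)^B.
        return 1.0 - (1.0 - p_block) ** n_blocks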

Scheme 2 - assume that you have a fixed pool of machines that you replicate blocks to. For
simplicity, I am going to take this to mean that there are K machines that contain a set of
blocks and all their replicas. So there are (N/K) such sets of machines. Further, assuming
an even distribution, there are only B/(N/K) blocks in each set of K machines.

P(of losing data given r failures) 
= P(r failures being in one set of K machines) * P(of losing at least 1 block in that set)


P(r failures being in one set of K machines) = C(N/K,1)*C(K,r)/C(N,r)
P(of losing at least 1 block in that set) = 1 - ((1 - 1/C(K,r)) ^ (B/(N/K))), which
follows from the fact that the set has K nodes and B/(N/K) blocks.
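
The same kind of sketch for Scheme 2, under the same assumptions (N/K is treated as a real
number here, just as in the numbers plugged in below):

    import math

    def p_loss_scheme2(n_nodes, repl, n_blocks, group_size):
        n_groups = n_nodes / group_size  # N/K sets of K machines
        # P(all r failures land inside a single set) = (N/K) * C(K, r) / C(N, r)
        p_colocated = n_groups * math.comb(group_size, repl) / math.comb(n_nodes, repl)
        # Given co-located failures, each of the B/(N/K) blocks in that set
        # is lost with probability 1 / C(K, r).
        p_block = 1.0 / math.comb(group_size, repl)
        p_any_block = 1.0 - (1.0 - p_block) ** (n_blocks / n_groups)
        return p_colocated * p_any_block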

Plugging in B = 30M, N = 1000, f = 3, r = 3, and K = 60 (replicate all blocks in the local,
previous, and next racks, 20 machines per rack):

Scheme 1 : P(data loss) = 1 - ((1 - 1/C(1000,3)) ^ 30M) = 0.165
Scheme 2 : P(data loss) = P(r failures being in one set of K machines) * P(of losing at least
1 block in that set) = 0.0034 * 1 = 0.0034
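
Running the two sketches above with these numbers reproduces the figures (rounded):

    print(p_loss_scheme1(1000, 3, 30_000_000))      # ~0.165
    print(p_loss_scheme2(1000, 3, 30_000_000, 60))  # ~0.0034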

Am I doing something wrong?



> Intelligent block placement policy to decrease probability of block loss
> ------------------------------------------------------------------------
>
>                 Key: HDFS-1094
>                 URL: https://issues.apache.org/jira/browse/HDFS-1094
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The current HDFS implementation specifies that the first replica is local and the other
> two replicas are on any two random nodes on a random remote rack. This means that if any
> three datanodes die together, then there is a non-trivial probability of losing at least
> one block in the cluster. This JIRA is to discuss whether there is a better algorithm that
> can lower the probability of losing a block.
