hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-158) dfs block ids sometimes collide
Date Mon, 01 May 2006 20:08:49 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12377273 ] 

Doug Cutting commented on HADOOP-158:
-------------------------------------

Using the formulas in:

http://en.wikipedia.org/wiki/Birthday_paradox#Generalisation

I think it is very unlikely that, with 64-bit block ids and a decent random number
generator, we are actually seeing collisions.  It seems more likely that the symptoms ascribed
to duplicate block id allocations are the result of other bugs.  Still, I would be more
comfortable not relying on random block id allocation long-term.
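
As a rough check of that estimate, here is a minimal sketch of the standard birthday
approximation p ~= 1 - exp(-n(n-1)/(2N)), where N = 2^64 is the size of the id space and n is
the number of ids drawn.  The function name and the sample values of n are illustrative
assumptions, not figures from a real cluster, and the formula assumes ids are drawn uniformly
and independently.

```python
import math


def collision_probability(num_ids: int, id_bits: int = 64) -> float:
    """Approximate chance that at least two of num_ids random ids collide.

    Uses the birthday approximation 1 - exp(-n(n-1)/(2N)), with N = 2**id_bits.
    Assumes a uniform, independent random number generator.
    """
    space = 2 ** id_bits
    exponent = -num_ids * (num_ids - 1) / (2 * space)
    # expm1 keeps precision when the probability is very close to zero.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical block counts, chosen only to show the scale of the result.
    for n in (10**6, 10**9, 2**32):
        print(f"{n} ids: p ~= {collision_probability(n):.3e}")
```

Under these assumptions a million ids give a probability of a few in 100 million, and the
chance only approaches even odds once the number of ids nears 2^32.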

> dfs block ids sometimes collide
> -------------------------------
>
>          Key: HADOOP-158
>          URL: http://issues.apache.org/jira/browse/HADOOP-158
>      Project: Hadoop
>         Type: Bug
>   Components: dfs
>     Reporter: Doug Cutting
>     Assignee: Konstantin Shvachko
>      Fix For: 0.2
>
> A random number generator is used to allocate block ids in dfs.  Sometimes a block id
> is allocated that is already used in the filesystem, which causes filesystem corruption.
> A short-term fix for this is to simply check when allocating block ids whether any file
> is already using the newly allocated id, and, if it is, generate another one.  There can still
> be collisions in some rare conditions, but these are harder to fix and will wait, since this
> simple fix will handle the vast majority of collisions.
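
The short-term fix described in the issue can be sketched roughly as follows.  This is not
the actual dfs code: `allocate_block_id`, `used_ids`, and the use of an in-memory set are
hypothetical stand-ins for however the namenode tracks which ids are in use, and ids are
treated here as unsigned 64-bit values for simplicity.

```python
import random


def allocate_block_id(used_ids: set, rng: random.Random) -> int:
    """Draw random 64-bit ids until one is not already in use, then reserve it.

    used_ids is a hypothetical in-memory record of every id currently
    referenced by a file; the real bookkeeping may look quite different.
    """
    while True:
        candidate = rng.getrandbits(64)
        if candidate not in used_ids:
            used_ids.add(candidate)
            return candidate


if __name__ == "__main__":
    used = set()
    rng = random.Random()
    first = allocate_block_id(used, rng)
    second = allocate_block_id(used, rng)
    print(first != second)
```

A check like this only sees ids that are known at allocation time, which is consistent with
the issue's note that collisions remain possible in some rare conditions.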

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

