hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file
Date Mon, 18 Jun 2007 17:44:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505899 ]

Doug Cutting commented on HADOOP-158:
-------------------------------------

> At some point, the system will have gone through a trillion file creation events [ ... ]

We generally aim to create no more than one block per drive every 100 milliseconds, so that
transfer dominates seek.  With 10,000 nodes, each with four drives, that would give a maximum
block creation rate of 400k/second (assuming a replication level of one).  At that rate it
would take over a million years to exhaust all 64-bit block ids.  I wonder what version Hadoop
will have then?
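
As a rough sanity check of the arithmetic above, here is a minimal Java sketch.  The class and
method names are hypothetical, and it assumes a 365-day year and that every id in the space is
usable.  At the stated rate it puts the lifetime of a 2^64 id space on the order of a million
years, and a 2^63 space (non-negative ids only) at about half that.

```java
import java.math.BigInteger;

// Hypothetical helper for the back-of-the-envelope estimate; not part of Hadoop.
final class BlockIdExhaustion {

    /** Cluster-wide block creations per second, at a replication level of one. */
    static long blocksPerSecond(long nodes, long drivesPerNode, long millisPerBlock) {
        return nodes * drivesPerNode * 1000L / millisPerBlock;
    }

    /** Whole years until an id space of 2^bits is used up at the given rate. */
    static BigInteger yearsToExhaust(int bits, long blocksPerSecond) {
        BigInteger ids = BigInteger.ONE.shiftLeft(bits);
        // Assumption: a 365-day year, which is precise enough for this estimate.
        BigInteger secondsPerYear = BigInteger.valueOf(365L * 24 * 60 * 60);
        return ids.divide(BigInteger.valueOf(blocksPerSecond)).divide(secondsPerYear);
    }

    public static void main(String[] args) {
        // 10,000 nodes, four drives each, one block per drive every 100 ms.
        long rate = blocksPerSecond(10_000, 4, 100);
        System.out.println("blocks/second:      " + rate);
        System.out.println("years for 2^64 ids: " + yearsToExhaust(64, rate));
        System.out.println("years for 2^63 ids: " + yearsToExhaust(63, rate));
    }
}
```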


> dfs should allocate a random blockid range to a file, then assign ids sequentially to
> blocks in the file
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-158
>                 URL: https://issues.apache.org/jira/browse/HADOOP-158
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.1.0
>            Reporter: Doug Cutting
>            Assignee: Konstantin Shvachko
>
> A random number generator is used to allocate block ids in dfs.  Sometimes a block id
> is allocated that is already used in the filesystem, which causes filesystem corruption.
> A short-term fix for this is to simply check when allocating block ids whether any file
> is already using the newly allocated id, and, if it is, generate another one.  There can still
> be collisions in some rare conditions, but these are harder to fix and will wait, since this
> simple fix will handle the vast majority of collisions.
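
The short-term fix described in the issue could look roughly like the Java sketch below.  This
is an illustration, not the actual dfs code: the class BlockIdAllocator and the in-memory set
usedIds are hypothetical stand-ins for however the namenode tracks the blocks that files
already use.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical allocator illustrating "check, and regenerate on collision".
final class BlockIdAllocator {
    // Assumption: every id already used by a file is known to this set.
    private final Set<Long> usedIds = new HashSet<>();
    private final Random random;

    BlockIdAllocator(Random random) {
        this.random = random;
    }

    /** Draws random ids until one is found that no known block is using. */
    synchronized long allocate() {
        long id;
        do {
            id = random.nextLong();
        } while (!usedIds.add(id)); // add() returns false when the id is already present
        return id;
    }

    public static void main(String[] args) {
        BlockIdAllocator allocator = new BlockIdAllocator(new Random());
        System.out.println("allocated block id: " + allocator.allocate());
    }
}
```

A check like this only protects against ids the allocator knows about, which is consistent with
the issue's remark that some rare collisions remain possible.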

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

