hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-898) Sequential generation of block ids
Date Thu, 21 Jan 2010 02:46:57 GMT

    [ https://issues.apache.org/jira/browse/HDFS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803142#action_12803142 ]

Konstantin Shvachko commented on HDFS-898:
------------------------------------------

I ran some experiments on some large and small images as a proof of concept. Here is the table.

- First row is the number of blocks in the file system. The largest I had is 40 million blocks.
- Second row is the largest hole, that is, the largest contiguous segment free of block ids.
- Third row is the minimum segment that we expect to find, which is calculated as the ratio 2 ^64^ / num_blocks.
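
As an illustration of the two computed rows, here is a minimal Python sketch. The function names and the assumption that block ids are signed 64-bit values are mine, not taken from the experiment itself, and the results should only be expected to land close to the table values, since the exact arithmetic used for the table is not stated.

```python
# Sketch only: helper names and the signed 64-bit id range are assumptions.
MIN_ID = -(2 ** 63)      # smallest signed 64-bit value
MAX_ID = 2 ** 63 - 1     # largest signed 64-bit value
ID_SPACE = 2 ** 64       # total number of possible ids


def expected_minimum(num_blocks):
    """Segment size if num_blocks ids were spread evenly over the id space."""
    return ID_SPACE // num_blocks


def largest_free_segment(block_ids):
    """Length of the longest run of consecutive unused ids."""
    ids = sorted(set(block_ids))
    # Sentinels just outside the id range let the edge gaps be measured too.
    bounds = [MIN_ID - 1] + ids + [MAX_ID + 1]
    return max(hi - lo - 1 for lo, hi in zip(bounds, bounds[1:]))
```

For example, `expected_minimum(40_509_569)` gives a value on the order of 455 billion, and `largest_free_segment` can be fed the block ids read from an image to obtain the second row.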

I don't know how to right-align numbers, so I used leading zeroes. I hope it is not confusing.

| Number of blocks     | 40,509,569 | 31,959,139 | 241,777 | 178,278 | 148,035 |
| Largest segment size | 8,623,203,281,141 | 10,662,709,581,709 | 889,137,135,725,504 | 1,324,814,576,358,595 | 1,849,602,429,191,491 |
| Expected minimum     | 0,455,367,560,644 | 00,577,197,761,694 | 076,296,205,914,968 | 0,103,471,211,268,346 | 0,124,609,852,155,620 |

We see that the selected segments are larger than the expected minimums and larger than 2 ^38^ = 274,877,906,944.
This speaks to the quality of the random generator, and it also projects a life span of more than 43 years for the first segment we choose.
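
The comparison above can be checked mechanically. The sketch below uses the table values as given; the helper name `years_to_exhaust` and the allocation rate of 200 block ids per second are hypothetical assumptions for illustration only, since this message does not state the rate behind the 43-year figure.

```python
# Values copied from the table in this message.
largest_segment = [
    8_623_203_281_141,
    10_662_709_581_709,
    889_137_135_725_504,
    1_324_814_576_358_595,
    1_849_602_429_191_491,
]
expected_min = [
    455_367_560_644,
    577_197_761_694,
    76_296_205_914_968,
    103_471_211_268_346,
    124_609_852_155_620,
]

THRESHOLD = 2 ** 38  # 274,877,906,944
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(segment_size, ids_per_second):
    """Years until a segment is used up at a constant allocation rate."""
    return segment_size / ids_per_second / SECONDS_PER_YEAR


# Every measured segment exceeds both its expected minimum and 2^38.
all_larger = all(
    seg > exp and seg > THRESHOLD
    for seg, exp in zip(largest_segment, expected_min)
)

# Hypothetical rate (an assumption, not from the message): 200 ids per second.
years_for_threshold = years_to_exhaust(THRESHOLD, 200)
```

Under that assumed rate a segment of 2 ^38^ ids lasts a little over 43 years, and every segment in the table is larger than that.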

> Sequential generation of block ids
> ----------------------------------
>
>                 Key: HDFS-898
>                 URL: https://issues.apache.org/jira/browse/HDFS-898
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.20.1
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>             Fix For: 0.22.0
>
>
> This is a proposal to replace random generation of block ids with a sequential generator in order to avoid block id reuse in the future.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

