hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
Date Sat, 28 Feb 2015 00:16:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341074#comment-14341074 ]

Arpit Agarwal commented on HDFS-7836:

bq. 1 M blocks per disk, on 10 disks, and 24 bytes per block, is a 240 MB block report (did
I do that math right?) That's definitely bigger than we'd like the full BR RPC to be, and
compression can help here. Or possibly separating the block report into multiple RPCs. Perhaps
one RPC per storage?
We do use one RPC per storage when the block count is over 1M, viz. {{DFS_BLOCKREPORT_SPLIT_THRESHOLD_DEFAULT}}.
The 24-bytes-per-block math doesn't hold, since protobuf encodes integers as varints on the wire. _9M blocks ~ 64MB_ was seen empirically
in a couple of different deployments, and that figure was used as the basis for the default of 1M.
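
To illustrate why a fixed 24 bytes per block overestimates the wire size, here is a minimal sketch of the variable-length integer sizing that protobuf applies to unsigned integers. It assumes each block is reported as three integers (block ID, length, generation stamp), which is what the 24-byte figure in the quote implies; the function names and sample values are hypothetical, and the result is not meant to reproduce the empirical 9M ~ 64MB number.

```python
def varint_len(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    n = 1
    # Each varint byte carries 7 payload bits; keep going while bits remain.
    while value >= 0x80:
        value >>= 7
        n += 1
    return n


def block_wire_bytes(block_id: int, num_bytes: int, gen_stamp: int) -> int:
    """Varint-encoded size of one block entry (assumed layout: three integers)."""
    return varint_len(block_id) + varint_len(num_bytes) + varint_len(gen_stamp)


# Hypothetical sample block: values chosen only for illustration.
SAMPLE_BLOCK = (2**30 + 1, 128 * 1024 * 1024, 1001)

FIXED_BYTES_PER_BLOCK = 3 * 8  # the 24-byte estimate: three fixed 64-bit fields
VARINT_BYTES_PER_BLOCK = block_wire_bytes(*SAMPLE_BLOCK)

if __name__ == "__main__":
    print("fixed:", FIXED_BYTES_PER_BLOCK, "varint:", VARINT_BYTES_PER_BLOCK)
```

For this sample block the varint encoding needs 11 bytes rather than 24. The actual per-block size depends on the magnitude of the IDs, lengths and generation stamps in a given deployment.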

bq. Hmm. Our sequential block allocations should guarantee that mod N produces an approximately
equal number of blocks in each stripe. It is only with randomly allocated block IDs that we
could even theoretically get an imbalance (although the probability is vanishingly small even
there if the randomness is uniform.). With sequentially allocated block IDs the stripes will
always be of equal size. I guess deletions of blocks could change that, but I see no reason
why any group of blocks mod N should be more deleted than another group.
With sequential allocation, a job that does 'create N files, delete M files, repeat'
could cause that imbalance over time.
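
A small simulation of that pattern, as a sketch only: it assumes one block per file, block IDs handed out sequentially from 0, and a job that always deletes the first M files of each batch. The function name and parameters are hypothetical. It shows a contrived worst case where the batch size lines up with the stripe count, next to a case where it does not.

```python
from collections import Counter


def simulate(rounds: int, create_n: int, delete_m: int, stripes: int) -> list:
    """Return the live block count per stripe (block_id mod stripes)."""
    next_id = 0
    live = set()
    for _ in range(rounds):
        # Sequential allocation: the batch gets the next create_n IDs.
        batch = list(range(next_id, next_id + create_n))
        next_id += create_n
        live.update(batch)
        # Assumed job behaviour: delete the first delete_m files of the batch.
        for block_id in batch[:delete_m]:
            live.discard(block_id)
    counts = Counter(block_id % stripes for block_id in live)
    return [counts.get(s, 0) for s in range(stripes)]


if __name__ == "__main__":
    # Batch size equals stripe count: every deleted ID lands in stripe 0.
    print(simulate(rounds=1000, create_n=4, delete_m=1, stripes=4))
    # Batch size not aligned with stripe count: deletions spread evenly.
    print(simulate(rounds=1000, create_n=5, delete_m=1, stripes=4))
```

In the aligned case stripe 0 ends up empty while the other three hold 1000 blocks each; in the unaligned case all four stripes hold 1000. Real workloads would fall somewhere in between, so this only shows that imbalance is possible, not how likely it is.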

> BlockManager Scalability Improvements
> -------------------------------------
>                 Key: HDFS-7836
>                 URL: https://issues.apache.org/jira/browse/HDFS-7836
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: BlockManagerScalabilityImprovementsDesign.pdf
> Improvements to BlockManager scalability.

This message was sent by Atlassian JIRA
