hadoop-hdfs-dev mailing list archives

From Arpit Agarwal <aagar...@hortonworks.com>
Subject Re: Why do sequential block IDs begin from 2^30?
Date Wed, 31 Aug 2016 12:16:35 GMT
Hi Ewan,

It’s not the low 30 bits that are reserved, just the first 2^30 block IDs. Block IDs are
longs, so we are unlikely to ever run out.

The choice was arbitrary and not based on likelihood of collision with earlier block IDs.
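
To make the distinction concrete, here is a minimal sketch (not the actual SequentialBlockIdGenerator code; the class, constant, and method names below are hypothetical). It assumes a counter that starts handing out IDs just past 2^30, and it assumes, as Ewan's question does, that legacy IDs were uniform over the 64-bit space. Under those assumptions, IDs issued after the reserved range use their low 30 bits freely, and the fraction of the ID space that the reserved range covers is 2^30 / 2^64 = 2^-34.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; names are not taken from Hadoop.
class SequentialIdSketch {
    // Assumption: sequential allocation begins after the first 2^30 IDs.
    static final long RESERVED_LIMIT = 1L << 30;

    private final AtomicLong current = new AtomicLong(RESERVED_LIMIT);

    // Returns the next sequential ID, strictly greater than RESERVED_LIMIT.
    long nextId() {
        return current.incrementAndGet();
    }

    // Low 30 bits of an ID, to show they are not held back once
    // allocation has moved past the reserved range.
    static long low30Bits(long id) {
        return id & (RESERVED_LIMIT - 1);
    }

    // Fraction of all 2^64 long values that lie in [0, 2^30): exactly 2^-34.
    static double reservedFraction() {
        return Math.scalb(1.0, 30 - 64);
    }

    // Expected number of uniformly random legacy IDs that would land in
    // the reserved range, given a count of legacy blocks.
    static double expectedLegacyInReservedRange(long legacyBlocks) {
        return legacyBlocks * reservedFraction();
    }

    public static void main(String[] args) {
        SequentialIdSketch gen = new SequentialIdSketch();
        long first = gen.nextId();
        System.out.println("first sequential id  = " + first);
        System.out.println("its low 30 bits      = " + low30Bits(first));
        System.out.println("reserved fraction    = " + reservedFraction());
        System.out.println("expected legacy hits = "
                + expectedLegacyInReservedRange(1_000_000_000L));
    }
}
```

Even for a billion uniformly random legacy IDs, the expected number falling below 2^30 stays well under one, which is consistent with the starting point not having been picked for collision avoidance.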



On 8/25/16, 2:38 AM, "Ewan Higgs" <Ewan.Higgs@hgst.com> wrote:

    Hi all,
    I see in o.a.h.hdfs.server.blockmanagement.SequentialBlockIdGenerator that the low 30
bits of the Block ID are reserved. This was set out in HDFS-4645 [1,2]:
    
    “””
    We do not change the block ID of any existing blocks on upgrade. Such existing blocks
whose IDs were randomly generated are subsequently referred to as legacy blocks.
    
    Henceforth block IDs will be allocated sequentially starting from a fixed constant e.g.
2^30.
    “””
    
    This doesn’t really follow since a uniform distribution wouldn’t have made block IDs
all that likely to have populated those low 30 bits. My only guess is that the pseudorandom
number generator in the legacy block ID generation was not uniform across the 64 bit block
ID space.
    
    In the Jira, Suresh suggested only 16 bits:
    
    “””
    I think we could reserve few block IDs say 0-65535 and start generating from 65535. When
it reaches some max, we could rollover to negative numbers. That is a decision that can be
made in the future.
    “””
    
    So I’m curious why the initial range was skipped if the pseudorandom number block ID
generator wouldn’t really have favoured the low range of block IDs. And why was the initial
range of 16 bits changed to skip the initial 30 bits?
    
    Thanks for the help in understanding this!
    
    Yours,
    Ewan
    
    [1] https://issues.apache.org/jira/browse/HDFS-4645
    [2] https://issues.apache.org/jira/secure/attachment/12589172/SequentialblockIDallocation.pdf
    