hadoop-hdfs-user mailing list archives

From Konstantin Boudnik <...@yahoo-inc.com>
Subject Re: how blocks are replicated
Date Mon, 16 Nov 2009 21:51:46 GMT
1. Your first guess is right - the file is 'broken' into blocks, which are then 
stored according to the replication policy and other placement rules.
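The arithmetic behind this can be sketched as follows. This is a hypothetical illustration (the function name and defaults are ours, not an HDFS API); it assumes the 64 MB default block size of Hadoop releases current at the time:

```python
import math

def block_replicas(file_size, block_size=64 * 1024 * 1024, replication=3):
    """Return (num_blocks, total_replicas) for a file of file_size bytes.

    Each block is replicated independently, so the replicas of a large
    file end up spread across many datanodes, not just three.
    """
    num_blocks = math.ceil(file_size / block_size)
    return num_blocks, num_blocks * replication

# A 1 GB file with a 64 MB block size and replication 3:
blocks, replicas = block_replicas(1024 * 1024 * 1024)
print(blocks, replicas)  # 16 blocks, 48 block replicas across the cluster
```

So with 30 datanodes, reads of that file can be served from many nodes in parallel, which is the performance benefit the question alludes to.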

2. It doesn't happen automatically, as far as I know. One has to re-balance 
the cluster in this case.
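As a sketch, re-balancing is done with the HDFS balancer; the commands below assume a running cluster and the Hadoop layout of that era (no assertions are possible without a live cluster, so treat this as an operational fragment):

```shell
# Start the balancer as a background daemon from the Hadoop install dir:
bin/start-balancer.sh

# Or run it in the foreground; -threshold is the allowed deviation (in
# percent) of each datanode's utilization from the cluster average:
hadoop balancer -threshold 10

# Afterwards, inspect where a file's blocks actually landed:
hadoop fsck /path/to/file -files -blocks -locations
```

The balancer moves block replicas from over-utilized datanodes to under-utilized ones (such as newly added nodes) until every node is within the threshold of the cluster average.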

Take care,

On 11/16/09 13:47 , Massoud Mazar wrote:
> This is probably a basic question:
> Assuming replication is set to 3, when we store a large file in HDFS, is
> the whole file stored in 3 nodes (even if you have many more nodes) or
> is it broken into blocks with each block written to 3 nodes? (I assume
> it is the latter, so when you have 30 nodes available, each one gets a
> piece of the file, providing more performance when reading the file.)
> My second question is what happens if we add more nodes to an existing
> cluster? Would any existing blocks be moved to these new nodes to expand
> the distribution of the data to new nodes?
> Thanks
> Massoud
