hadoop-hdfs-dev mailing list archives

From cheng xu <xcheng...@gmail.com>
Subject About how HDFS chooses datanodes to store files
Date Thu, 05 May 2011 06:54:09 GMT
Hi all!

We know that HDFS divides a large file into several blocks (64 MB each, 3
replicas by default). Once the metadata in the namenode has been updated, a
DataStreamer thread transports the blocks to the datanodes; for each block,
the client sends the data to the 3 datanodes through a pipeline.


    // From DFSOutputStream: ask the namenode to create the file entry,
    // then start the single DataStreamer thread that ships the blocks.
    dfsClient.namenode.create(src, masked, dfsClient.clientName,
        new EnumSetWritable<CreateFlag>(flag), createParent, replication,
        blockSize);
    streamer = new DataStreamer();
    streamer.start();
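
To check my understanding of the pipeline part, here is a toy sketch of what
I think happens (the class names are mine, not the real DFSClient/DataNode
code): the client hands each packet to the first datanode only, and every
datanode stores the packet and forwards it downstream, so 3 replicas are
written from a single client send.

    class PipelineSketch {
        static class DataNodeStub {
            private final String name;
            private final DataNodeStub downstream; // null for the last node

            DataNodeStub(String name, DataNodeStub downstream) {
                this.name = name;
                this.downstream = downstream;
            }

            void receive(byte[] packet) {
                System.out.println(name + " stored " + packet.length + " bytes");
                if (downstream != null) {
                    downstream.receive(packet); // forward the same packet downstream
                }
            }
        }

        public static void main(String[] args) {
            // Build a 3-node pipeline dn1 -> dn2 -> dn3 (default replication = 3).
            DataNodeStub dn3 = new DataNodeStub("dn3", null);
            DataNodeStub dn2 = new DataNodeStub("dn2", dn3);
            DataNodeStub dn1 = new DataNodeStub("dn1", dn2);
            dn1.receive(new byte[64]); // the client only ever talks to dn1
        }
    }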

I am just wondering how the cluster chooses which datanodes store a block's
replicas. What policy is used?
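
My rough guess, from skimming BlockPlacementPolicyDefault, is something
rack-aware like the sketch below (all names here are mine, only to illustrate
the shape I expect: first replica on the writer's node, second on a different
rack, third on another node of that second rack). Please correct me if this
guess is wrong.

    import java.util.ArrayList;
    import java.util.List;

    class PlacementSketch {
        static class Node {
            final String host;
            final String rack;
            Node(String host, String rack) { this.host = host; this.rack = rack; }
        }

        // Hypothetical rack-aware target selection for 3 replicas.
        static List<Node> chooseTargets(Node writer, List<Node> cluster) {
            List<Node> targets = new ArrayList<Node>();
            targets.add(writer);                 // replica 1: the local node
            for (Node n : cluster) {             // replica 2: a node on another rack
                if (!n.rack.equals(writer.rack)) { targets.add(n); break; }
            }
            Node second = targets.get(targets.size() - 1);
            for (Node n : cluster) {             // replica 3: a second node on that rack
                if (n != second && n.rack.equals(second.rack)) { targets.add(n); break; }
            }
            return targets;
        }
    }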
Also, since a file may consist of many blocks, in what sequence are these
blocks transported? From what I read in the code, there is only one thread
doing this from the client to the datanodes.
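
To illustrate what I mean by one thread, here is a toy model of my reading
(hypothetical names, not the real DataStreamer): the client queues packets
and a single thread drains the queue in FIFO order, so the blocks of a file
would go out strictly one after another.

    import java.util.Arrays;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    class StreamerSketch {
        static final BlockingQueue<String> dataQueue = new LinkedBlockingQueue<String>();

        public static void main(String[] args) throws InterruptedException {
            // The single streamer thread: takes packets in arrival order.
            Thread streamer = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            String packet = dataQueue.take();
                            if (packet.equals("EOF")) return; // end of stream
                            System.out.println("sent " + packet);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            streamer.start();

            // All packets of block 1 are queued (and sent) before block 2's.
            for (String p : Arrays.asList("blk1-pkt1", "blk1-pkt2", "blk2-pkt1", "EOF")) {
                dataQueue.put(p);
            }
            streamer.join();
        }
    }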


Any answers or URLs are appreciated. Thanks!

Best regards,
xu
