hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: How can HDFS spread the data across the data nodes ?
Date Sun, 01 Feb 2009 23:56:28 GMT
If the write takes place on a machine that is itself a datanode, by design
the first replica is written to that local datanode. The remaining replicas
are written to different nodes.

When you write from the namenode, which is generally not also a datanode,
Hadoop allocates the replica blocks pseudo-randomly across all of your
datanodes.
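
You can check where the replicas of a file actually ended up by running fsck,
e.g.:

  bin/hadoop fsck /test_file -files -blocks -locations

The placement decision itself is made by the namenode each time the client
asks for a new block; FsShell's copyFromLocal just streams the data to the
datanodes the namenode hands back. If you want to look at block locations
programmatically, here is a rough, untested sketch against the client
FileSystem API (the class name ShowBlockLocations and the /test_file path are
just for illustration; check that getFileBlockLocations() is present in your
0.18 build):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.BlockLocation;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class ShowBlockLocations {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration(); // picks up hadoop-site.xml from the classpath
      FileSystem fs = FileSystem.get(conf);     // the HDFS named by fs.default.name
      Path p = new Path("/test_file");          // illustrative path from the original question
      FileStatus stat = fs.getFileStatus(p);
      // one BlockLocation per block, listing the datanodes that hold a replica of it
      BlockLocation[] blocks = fs.getFileBlockLocations(stat, 0, stat.getLen());
      for (int i = 0; i < blocks.length; i++) {
        System.out.println("block " + i + " on: "
            + java.util.Arrays.toString(blocks[i].getHosts()));
      }
      fs.close();
    }
  }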

On Sun, Feb 1, 2009 at 3:09 PM, kang_min82 <kang_min82@yahoo.com> wrote:

>
> Hi everyone,
>
> I'm completely new to HDFS. I hope you can take a little time to answer my
> question :).
>
> I have a total of 3 nodes in my cluster: one reserved for the master (NameNode
> and JobTracker) and the other two nodes as slaves (DataNodes).
>
> I tried to "copy" a file to HDFS with the following command:
>
> kang@vn:~/v-0.18.0$ hadoop-0.18.0/bin/hadoop fs -put test_file /
>
> If I start the command on the master, HDFS spreads my file across all the
> data nodes. That should be fine! But when I start the command on any data
> node, HDFS doesn't spread the file, which means the whole file is only
> written to this data node. Is it a bug?
>
> My question is: how does HDFS manage something like that, and which Java
> class is involved?
>
> I read the script bin/hadoop and know that the class FsShell.java and its
> copyFromLocal method are involved. But I don't see how the master manages
> and decides which data nodes a file can be written to.
>
> Any help is appreciated, thanks so much.
>
> Kang
>
> --
> View this message in context:
> http://www.nabble.com/How-can-HDFS-spread-the-data-across-the-data-nodes---tp21781703p21781703.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>
