Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of lists@nabble.com designates
 216.139.236.158 as permitted sender)
Message-ID: <21781703.post@talk.nabble.com>
Date: Sun, 1 Feb 2009 15:09:00 -0800 (PST)
From: kang_min82 <kang_min82@yahoo.com>
To: core-user@hadoop.apache.org
Subject: How can HDFS spread the data across the data nodes ?
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


Hi everyone, 

I'm completely new to HDFS. I hope you can take a little time to answer my
question :).

I have a total of 3 nodes in my cluster: one reserved for the master (NameNode
and JobTracker) and the other two as slaves (DataNodes).

I tried to "copy" a file to HDFS with the following command:

kang@vn:~/v-0.18.0$ hadoop-0.18.0/bin/hadoop fs -put test_file /

If I run the command on the master, HDFS spreads my file across all the data
nodes. That seems fine! But when I run the command on any data node, HDFS
doesn't spread the file; the whole file is written only to that data node.
Is this a bug?

My question is: how does HDFS manage this, and which Java class is involved?

I read the script bin/hadoop and know that the class FsShell (FsShell.java)
and its method copyFromLocal are involved. But I don't see how the master
decides which data nodes a file is written to.

Any help is appreciated, thanks so much.

Kang

-- 
View this message in context: http://www.nabble.com/How-can-HDFS-spread-the-data-across-the-data-nodes---tp21781703p21781703.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.