Date: Sun, 1 Feb 2009 15:09:00 -0800 (PST)
From: kang_min82
To: core-user@hadoop.apache.org
Subject: How can HDFS spread the data across the data nodes ?

Hi everyone,

I'm completely new to HDFS.
Hope you guys can take a little time to answer my question :).

I have 3 nodes in total in my cluster: one reserved as the master (NameNode and JobTracker) and the other two as slaves (DataNodes). I tried to copy a file to HDFS with the following command:

    kang@vn:~/v-0.18.0$ hadoop-0.18.0/bin/hadoop fs -put test_file /

If I run the command on the master, HDFS spreads my file across all the data nodes. That is what I expect. But when I run the command on any data node, HDFS doesn't spread the file: the whole file is written only to that data node. Is this a bug?

My question is: how does HDFS manage this, and which Java class is involved? I read the script bin/hadoop and know that the class FsShell.java and its method copyFromLocal are involved. But I don't see how the master decides which data nodes a file is written to.

Any help is appreciated, thanks so much.

Kang
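
The behaviour described above matches HDFS's default placement rule: the first replica of each block goes to the writer's own node when the writer is itself a DataNode, and to a randomly chosen DataNode otherwise. Below is a minimal sketch of that rule, assuming a replication factor of 1. The class PlacementSketch, the method placeBlocks, and the node names are hypothetical and only illustrate the idea; this is not Hadoop's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Hadoop.
final class PlacementSketch {

    /**
     * Chooses one target DataNode per block, assuming replication factor 1.
     * If the writer is itself a DataNode, every block is placed locally;
     * otherwise each block goes to a randomly chosen DataNode.
     */
    static List<String> placeBlocks(String writer, List<String> dataNodes,
                                    int blockCount, Random rng) {
        List<String> targets = new ArrayList<>();
        boolean writerIsDataNode = dataNodes.contains(writer);
        for (int i = 0; i < blockCount; i++) {
            if (writerIsDataNode) {
                targets.add(writer); // local write: no spreading
            } else {
                targets.add(dataNodes.get(rng.nextInt(dataNodes.size())));
            }
        }
        return targets;
    }
}
```

Under this assumption, calling placeBlocks("slave1", ...) returns "slave1" for every block, which would explain why the whole file ends up on the node where the command was run. Calling it with "master" as the writer spreads the blocks over both slaves.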