From: "Devaraj Das"
To: core-user@hadoop.apache.org
Subject: RE: [Map/Reduce][HDFS]
Date: Fri, 28 Mar 2008 21:08:36 +0530

Hi Jean, no, that is not directly possible. You have to pass your data through the DFS client for it to become part of the DFS (e.g. hadoop fs -put ..., or programmatically).

(Removing core-dev from this thread, since this is really a core-user question.)

> -----Original Message-----
> From: Jean-Pierre [mailto:jean-pierre.ocalan@247realmedia.com]
> Sent: Friday, March 28, 2008 8:58 PM
> To: core-user@hadoop.apache.org; core-dev
> Subject: Re: [Map/Reduce][HDFS]
>
> Hello,
>
> I'm not sure I've understood... actually I've already set this field
> in the configuration file. I think this field just specifies the
> master for the HDFS.
>
> My problem is that I have many machines, each with a bunch of files
> that represent the distributed data, and I want to use this
> distribution of data with Hadoop. Maybe there is another
> configuration file which would let me tell Hadoop how to use my file
> distribution. Is that possible? Or should I look at adapting my
> distribution of data to the Hadoop one?
>
> Anyway, thanks for your answer Peeyush.
>
> On Fri, 2008-03-28 at 16:22 +0530, Peeyush Bishnoi wrote:
> > Hello,
> >
> > Yes, you can do this by specifying in hadoop-site.xml the location
> > of the namenode where your data has already been distributed.
> >
> > ---------------------------------------------------------------
> > fs.default.name
> > ---------------------------------------------------------------
> >
> > Thanks
> >
> > ---
> > Peeyush
> >
> > On Thu, 2008-03-27 at 15:41 -0400, Jean-Pierre wrote:
> > > Hello,
> > >
> > > I'm working on a large amount of logs, and I've noticed that
> > > distributing the data over the network (./hadoop dfs -put input
> > > input) takes a lot of time.
> > >
> > > Let's say that my data is already distributed among the network;
> > > is there any way to tell Hadoop to use the already existing
> > > distribution?
> > >
> > > Thanks
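
[Archive note: the hadoop-site.xml snippet quoted above lost its XML markup; only the property name `fs.default.name` survived. For context, a property of that name is conventionally set like the following sketch — the hostname and port here are placeholders, not values from the original message:]

```xml
<!-- hadoop-site.xml: point clients at the namenode.
     "namenode.example.com:9000" is a hypothetical address. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:9000</value>
</property>
```

Note that, as the reply explains, this only tells clients which namenode to talk to; it does not make pre-existing local files on each machine part of HDFS. Data still has to be written through the DFS client.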