From: Ted Dunning
Date: Tue, 15 Apr 2008 13:37:40 -0700
Subject: Re: multiple datanodes in the same machine
Reply-To: core-user@hadoop.apache.org
In-Reply-To: <4cc657e40804151312g7dc51d16y5bfd39c21ae2d77a@mail.gmail.com>

I have had no issues in scaling the number of datanodes.
The location of the data is almost invisible to MR programs. I have had
issues in going from local to distributed mode, but those have been entirely
due to classpath-like issues.

Since MR naturally restricts your focus, it is pretty much the rule that
programs scale without much thought.

If you test with two tasktrackers and one data node, you should have a
pretty solid test environment.


On 4/15/08 1:12 PM, "cagdas.gerede@gmail.com" wrote:

> Testing when I do not have 10 machines.
>
>
> On 4/15/08, Ted Dunning wrote:
>>
>> Why do you want to do this perverse thing?
>>
>> How does it help to have more than one datanode per machine? And what in
>> the world is better when you have 10?
>>
>>
>> On 4/15/08 12:53 PM, "Cagdas Gerede" wrote:
>>
>>> I have a follow-up question,
>>> Is there a way to programmatically configure datanode parameters and start
>>> the datanode process?
>>> If I want to create 10 datanodes on the same host, do I have to create 10
>>> config files?
>>>
>>>
>>> On Tue, Apr 15, 2008 at 12:29 PM, dhruba Borthakur
>>> wrote:
>>>
>>>> Yes, just point the Datanodes to different config files, different sets
>>>> of ports, different data directories, etc.
>>>>
>>>> Thanks,
>>>> dhruba
>>>>
>>>> -----Original Message-----
>>>> From: Cagdas Gerede [mailto:cagdas.gerede@gmail.com]
>>>> Sent: Tuesday, April 15, 2008 11:21 AM
>>>> To: core-user@hadoop.apache.org
>>>> Subject: multiple datanodes in the same machine
>>>>
>>>> Is there a way to run multiple datanodes in the same machine?
>>>>
>>>>
>>>> --
>>>> ------------
>>>> Best Regards, Cagdas Evren Gerede
>>>> Home Page: http://cagdasgerede.info
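
For anyone who does want several datanodes on one host for testing, the advice quoted above (a separate config file, set of ports, and data directory per datanode) could be scripted rather than written by hand. The following is only a rough sketch under stated assumptions: the property names `dfs.data.dir`, `dfs.datanode.address` and `dfs.datanode.http.address`, the file name `hadoop-site.xml`, the base directory and the port numbers are all assumptions that depend on the Hadoop release, and the helper names are made up for illustration. Other per-datanode ports may also need overriding in some releases.

```python
import os
from xml.sax.saxutils import escape


def datanode_properties(index, base_dir="/tmp/hadoop-dn", base_port=51000):
    """Return the per-datanode overrides for the index-th datanode.

    Each datanode gets its own data directory and its own ports.
    The property names are assumptions; check them against the
    defaults file shipped with the Hadoop release in use.
    """
    port = base_port + 10 * index
    return {
        "dfs.data.dir": f"{base_dir}/dn{index}/data",
        "dfs.datanode.address": f"0.0.0.0:{port}",
        "dfs.datanode.http.address": f"0.0.0.0:{port + 1}",
    }


def render_site_xml(properties):
    """Render a name -> value mapping as a Hadoop-style configuration file."""
    lines = ['<?xml version="1.0"?>', "<configuration>"]
    for name, value in sorted(properties.items()):
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines) + "\n"


def write_configs(count, out_dir):
    """Write one config directory per datanode and return the file paths."""
    paths = []
    for index in range(count):
        conf_dir = os.path.join(out_dir, f"dn{index}")
        os.makedirs(conf_dir, exist_ok=True)
        # Assumed file name; releases differ in which site file they read.
        path = os.path.join(conf_dir, "hadoop-site.xml")
        with open(path, "w") as handle:
            handle.write(render_site_xml(datanode_properties(index)))
        paths.append(path)
    return paths
```

Calling `write_configs(10, "conf-multi")` would produce ten directories, `conf-multi/dn0` through `conf-multi/dn9`. Each one holds only the overrides, so the shared settings (such as the namenode address) would still have to be added before a datanode is started against that directory.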