From: Ted Dunning
To: hadoop-user@lucene.apache.org
Date: Fri, 13 Jul 2007 09:27:00 -0700
Subject: Re: Client Scaling

100 new files / 10 minutes = 0.17 files per second.  Any namenode at all
should be able to handle this in its sleep.

100 clients x 10 MB / 10 minutes = 1 GB / 600 seconds = 1.7 MB/s.  This is
actually very low aggregate bandwidth.  That many clients should be able to
handle this very easily.  Depending on what you need to do to the data, you
should be able to handle this with only 10 (possibly fewer).

On 7/13/07 9:15 AM, "Marco Nicosia" wrote:

> In general, I bet 100 clients should be no problem if you have a reasonable
> number of dataNodes, especially if the client operations are not
> simultaneous.
>
>> How well does Hadoop scale for multiple client inputs? For instance, could a
>> reasonably powerful namenode handle 100 client machines copying in 10 MB
>> every 10 minutes? Assume all of the clients would be running a wrapper
>> around the "copyFromLocalFile" method.
>>
>> Thanks,
>>
>> Stu Hood
>> Webmail.us
>> "You manage your business. We'll manage your email."®
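
[Editor's note] For readers following the thread, a minimal sketch of the kind
of client wrapper Stu describes (a thin layer over FileSystem.copyFromLocalFile)
might look like the code below. The class name, argument handling, and example
paths are hypothetical; only the Configuration/FileSystem calls come from the
Hadoop API of that era.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical periodic-upload client: each of the ~100 client machines
    // would run something like this every 10 minutes, pushing ~10 MB into HDFS.
    // As noted above, that is roughly 0.17 file creations/s and ~1.7 MB/s
    // aggregate, which is light load for a single namenode.
    public class PeriodicUploader {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();  // picks up the cluster's fs settings
            FileSystem fs = FileSystem.get(conf);      // connect to the configured HDFS
            Path local = new Path(args[0]);            // e.g. a local spool file (hypothetical)
            Path remote = new Path(args[1]);           // e.g. an /incoming/ path in HDFS (hypothetical)
            fs.copyFromLocalFile(local, remote);       // the call the clients wrap
            fs.close();
        }
    }

Whether 100 such clients stagger their uploads or fire simultaneously matters
more for datanode I/O than for the namenode, which only sees the metadata
operations.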