From: Sean Owen
To: user@mahout.apache.org
Cc: David Stuart
Date: Sun, 13 Mar 2011 11:20:47 +0000
Subject: Re: Scaling question

There's no real point in creating virtual machines in order to do more work per machine; just configure Hadoop to run more workers per machine. A good first approximation is indeed to run one worker per core.

I think you'll find that a lot of Mahout-related jobs are I/O-bound, not CPU-bound, so you may hit a bottleneck with fewer workers than that. You may then find you get more bang for your buck not from more RAM or cores but from more and faster disks, and from getting Hadoop to use them.

On Sun, Mar 13, 2011 at 10:41 AM, David Stuart wrote:
> Hey,
>
> I have done my initial tests locally and now want to build a cluster. My question is: I currently have three big machines (32 GB RAM and 2 x 6 cores). Would it be more effective/faster to keep the machines as they are, or to divide them into virtual machines and have, say, 6 machines per server?
>
> Regards
>
> David Stuart
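
For reference, here is a rough sketch of what "more workers per machine" and "getting Hadoop to use the disks" could look like in configuration terms. It assumes a Hadoop 0.20-era setup, where these property names apply (later releases renamed them). The 8 + 4 slot split for a 12-core box and the /disk1, /disk2 mount points are illustrative assumptions, not values from this thread; as noted above, an I/O-bound job may do best with fewer slots than cores.

```xml
<!-- mapred-site.xml: task slots per TaskTracker (per machine).
     Assumed split for a 2 x 6 core machine: 8 map + 4 reduce = 12 slots,
     i.e. roughly one worker per core as a starting point. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>

<!-- mapred-site.xml: spread intermediate map output over several disks.
     The mount points below are hypothetical. -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>

<!-- hdfs-site.xml: spread HDFS block storage over the same disks. -->
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/dfs/data,/disk2/dfs/data</value>
</property>
```

Both directory properties take a comma-separated list, and Hadoop spreads its writes across the listed directories, which is how extra physical disks turn into extra I/O throughput. The slot counts are best tuned by watching whether the machines are waiting on CPU or on disk while a real job runs.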