mahout-user mailing list archives

From Sean Owen
Subject Re: Scaling question
Date Sun, 13 Mar 2011 11:20:47 GMT
There's no real point in making virtual machines in order to do more
work per machine -- just make Hadoop run more workers per machine. A
good first approximation is indeed to run one worker per core.
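
As a rough sketch of the one-worker-per-core rule, applied to the machines described in the question (2 x 6 cores each), the per-node slot counts could be worked out as below. The property names are the TaskTracker slot settings from Hadoop 0.20-era releases. The helper name and the two-thirds map / one-third reduce split are assumptions for illustration, not something stated in this thread.

```python
def slots_per_node(sockets: int, cores_per_socket: int, map_share: float = 2 / 3) -> dict:
    """Turn 'one worker per core' into map and reduce slot counts for one node.

    The map/reduce split (map_share) is an assumed starting point, not a rule;
    an I/O-bound job may saturate the disks with fewer slots than this.
    """
    cores = sockets * cores_per_socket
    # At least one slot of each kind, even on a very small machine.
    map_slots = max(1, round(cores * map_share))
    reduce_slots = max(1, cores - map_slots)
    return {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
    }


# Hypothetical node matching the question: 2 sockets x 6 cores.
settings = slots_per_node(sockets=2, cores_per_socket=6)
for name, value in settings.items():
    print(f"{name} = {value}")
```

The resulting values would go into mapred-site.xml on each node. Treat them as an upper bound to measure against rather than a target.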

I think you'll find a lot of Mahout-related jobs are I/O-bound, not
CPU-bound. So you may reach a bottleneck with fewer workers than that.
And then you may find you get more bang for your buck not with more
RAM or cores but more and faster disks, and getting Hadoop to use

On Sun, Mar 13, 2011 at 10:41 AM, David Stuart wrote:
> Hey,
> I have done my initial tests locally and now want to build a cluster. My question
> is: currently I have three big machines (32 GB RAM and 2 x 6 cores). Would it be more effective/faster
> to keep the machines as is, or to divide them into virtual machines and have say 6 machines per
> Regards
> David Stuart
