hadoop-common-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Hardware inquiry
Date Fri, 05 Feb 2010 16:13:16 GMT
Hi Justin,

In my experience, the best-balanced machine overall, if you're buying
today, has two recent quad-core processors, 4-6 x 1TB 7200 RPM SATA
disks, and 16-24GB of RAM. It sounds beefy, but it hits a good
price/performance/power sweet spot - you can still get this in 1U and
hence pack a lot of punch into each rack.

Depending on your problem and data, you can bump up RAM, disk, or CPU.
For example, some people are running archival clusters with 12 or even
24 disks per node, or using 1.5TB/2TB disks on each node.
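
To make the trade-off concrete, here is a rough back-of-the-envelope sketch of storage per rack for these configurations. The nodes-per-rack figures (40 for 1U, 20 for larger archival chassis) and the helper name are my assumptions for illustration, not numbers from any vendor; the replication factor of 3 is the HDFS default. It ignores space reserved for the OS, logs, and intermediate MapReduce output, so real usable capacity will be lower.

```python
def rack_capacity_tb(nodes_per_rack, disks_per_node, disk_tb, replication=3):
    """Return (raw_tb, usable_tb) for one rack.

    usable_tb is raw capacity divided by the HDFS replication factor;
    it does not subtract OS, log, or temporary-data overhead.
    """
    raw = nodes_per_rack * disks_per_node * disk_tb
    usable = raw / replication
    return raw, usable


if __name__ == "__main__":
    # Hypothetical rack layouts; adjust to your own power and space limits.
    configs = {
        "balanced, 4 x 1TB": (40, 4, 1.0),
        "balanced, 6 x 1TB": (40, 6, 1.0),
        "archival, 12 x 2TB": (20, 12, 2.0),
    }
    for name, (nodes, disks, size_tb) in configs.items():
        raw, usable = rack_capacity_tb(nodes, disks, size_tb)
        print(f"{name}: {raw:.0f} TB raw, {usable:.0f} TB after replication")
```

Plugging in your own node counts and disk sizes gives a quick way to compare "more, smaller machines" against "fewer, denser machines" on storage alone; CPU and RAM per terabyte still need to be weighed separately.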


On Wed, Feb 3, 2010 at 5:14 PM, Justin Becker <becker.justin@gmail.com> wrote:
> My organization has decided to make a substantial investment in hardware for
> processing Hadoop jobs.  Our cluster will be used by multiple groups, so it's
> hard to classify the problems as IO, memory, or CPU bound.  Would others be
> willing to share their hardware profiles coupled with the problem types
> (memory, CPU, etc.)?  Our current setup for the existing cluster is made up
> of the following machines:
> Poweredge 1655
> 2x2 Intel Xeon 1.4ghz
> 72GB local HD
> Poweredge 1855
> 2x2 Intel Xeon 3.2ghz
> 146GB local HD
> Poweredge 1955
> 2x2 Intel Xeon 3.0ghz
> 72GB local HD
> Obviously, we would like to increase local disk space, memory, and the
> number of cores.  The not-so-obvious decision is whether to select high end
> equipment (fewer machines) or lower-class hardware.  We're trying to balance
> "how commodity" against the administration costs.  I've read the machine
> scaling material on the Hadoop wiki.  Any additional real-world advice would
> be awesome.
> Thanks,
> Justin
