hadoop-mapreduce-user mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: Spindle per Cores
Date Fri, 12 Oct 2012 20:52:59 GMT
I think that this rule of thumb is to prevent people from configuring 2-disk
clusters with 16 cores or 48-disk machines with 4 cores.  Both
configurations could make sense in narrow applications, but both would most
probably be sub-optimal.

Within narrow bands, I doubt you will see huge changes.  I like to be able
to:

a) saturate disk I/O, which requires some CPU and a good controller.
Different distros vary a lot here.

b) have enough memory per slot.  Lots of people go cheap on this and
wind up hamstringing performance.

c) make sure there is enough CPU left over for the application.  This is
hugely app dependent, obviously.
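
To make the arithmetic in this thread concrete, here is a minimal Python
sketch that computes spindles per core for the configurations mentioned
(2 disks with 16 cores, 48 disks with 4 cores, 12 disks with dual 4-core
processors, 8 disks with 8 cores) and memory per task slot. The function
names are hypothetical, and the thread gives no memory sizes or slot counts,
so memory_per_slot_gb takes whatever values apply to your own nodes. It only
reports ratios; it does not say which ratio is optimal, since that depends
on the application.

```python
def logical_cores(physical_cores: int, hyperthreading: bool) -> int:
    """Cores visible to the OS; Hyper-Threading doubles the count."""
    return physical_cores * 2 if hyperthreading else physical_cores


def spindles_per_core(disks: int, cores: int) -> float:
    """Ratio of disks (spindles) to cores; 1.0 is the 1:1 rule of thumb."""
    return disks / cores


def memory_per_slot_gb(memory_gb: float, task_slots: int) -> float:
    """Memory available to each task slot (inputs are site-specific)."""
    return memory_gb / task_slots


if __name__ == "__main__":
    # (label, disks, physical cores) taken from the messages in this thread.
    configs = [
        ("2 disks, 16 cores", 2, 16),
        ("48 disks, 4 cores", 48, 4),
        ("12 disks, dual 4-core", 12, 8),
        ("8 disks, 8 cores", 8, 8),
    ]
    for label, disks, cores in configs:
        physical = spindles_per_core(disks, cores)
        # Same node counted by logical cores, assuming Hyper-Threading is on.
        logical = spindles_per_core(disks, logical_cores(cores, True))
        print(f"{label}: {physical:.3f} per physical core, "
              f"{logical:.3f} per logical core")
```

Note that the 8-disk, 8-core node is 1:1 only when counting physical cores;
counted by the 16 logical cores used to size task slots, it is 1:2.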

On Fri, Oct 12, 2012 at 1:45 PM, Hank Cohen <hank.cohen@altior.com> wrote:

> What empirical evidence is there for this rule of thumb?
> In other words, what tests or metrics would indicate an optimal
> spindle/core ratio and how dependent is this on the nature of the data and
> of the map/reduce computation?
> My understanding is that there are lots of clusters with more spindles
> than cores.  Specifically, typical 2U servers can hold 12 3.5" disk drives.
>  So lots of Hadoop clusters have dual 4 core processors and 12 spindles.
>  Would it be better to have 6 core processors if you are loading up the
> boxes with 12 disks?  And most importantly, how would one know that the mix
> was optimal?
> Hank Cohen
> Altior Inc.
> -----Original Message-----
> From: Patai Sangbutsarakum [mailto:silvianhadoop@gmail.com]
> Sent: Friday, October 12, 2012 10:46 AM
> To: user@hadoop.apache.org
> Subject: Spindle per Cores
> I have read around about hardware recommendations for a Hadoop cluster.
> One of them recommends a 1:1 ratio of spindles to cores.
> Intel CPUs come with Hyper-Threading, which doubles the number of cores on
> one physical CPU, e.g. 8 cores with Hyper-Threading become 16, which is
> where we start to calculate the number of task slots per node.
> When it comes to spindles, I strongly believe I should pick 8 cores and
> pick 8 disks in order to get a 1:1 ratio.
> Please suggest
> Patai
