hadoop-user mailing list archives

From "Goldstone, Robin J." <goldsto...@llnl.gov>
Subject Re: Why they recommend this (CPU) ?
Date Thu, 11 Oct 2012 19:47:15 GMT
Be sure you are comparing apples to apples.  The E5-2650 has a larger cache than the E5-2640,
a faster system bus, and supports faster (1600 MHz vs. 1333 MHz) DRAM, resulting in greater
potential memory bandwidth.


From: Patrick Angeles <patrick@cloudera.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Thursday, October 11, 2012 12:36 PM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: Why they recommend this (CPU) ?

If you look at comparable Intel parts:

Intel E5-2640
6 cores @ 2.5 GHz
95W - $885

Intel E5-2650
8 cores @ 2.0 GHz
95W - $1107

So, for about $400 more on a dual-processor system -- which really isn't much -- you get 2 more
cores per socket at a 20% lower clock speed. I can believe that in some scenarios the faster
cores would fare better. Gzip compression is one that comes to mind, where you are aggressively
trading CPU for lower storage volume and I/O. An HBase cluster is another example.
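Patrick's price/performance tradeoff can be sketched numerically. A rough back-of-envelope comparison using the list prices quoted above (aggregate GHz across cores is a crude throughput proxy; it deliberately ignores the cache and memory-bandwidth differences Robin mentions):

```python
# Crude comparison of the two parts using the figures from this thread.
# Aggregate clock (cores * GHz) is only a rough proxy for throughput --
# it ignores cache size, memory speed, and per-core efficiency.

parts = {
    "E5-2640": {"cores": 6, "ghz": 2.5, "price": 885},
    "E5-2650": {"cores": 8, "ghz": 2.0, "price": 1107},
}

for name, p in parts.items():
    aggregate = p["cores"] * p["ghz"]   # total GHz across all cores
    per_ghz = p["price"] / aggregate    # dollars per aggregate GHz
    print(f"{name}: {aggregate:.1f} GHz total, ${per_ghz:.2f}/GHz")
```

On this crude measure the 8-core part buys only about 7% more aggregate clock (16.0 vs. 15.0 GHz) for about 25% more money, which is why the per-core speed question matters for CPU-bound work.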

On Thu, Oct 11, 2012 at 3:03 PM, Russell Jurney <russell.jurney@gmail.com> wrote:
My own clusters are too temporary and virtual for me to notice. I haven't thought of clock
speed as having mattered in a long time, so I'm curious what kind of use cases might benefit
from faster cores. Is there a category of workload where this sweet spot for faster cores applies?

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 11:39 AM, Ted Dunning <tdunning@maprtech.com> wrote:

You should measure your workload.  Your experience will vary dramatically with different computations.
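Ted's advice can start as small as timing a representative slice of the job on each candidate box. A minimal sketch, where `work` is a hypothetical stand-in for one task's worth of your real computation:

```python
import time

def work(n=200_000):
    # Hypothetical stand-in for a CPU-bound task; replace with a
    # representative slice of your actual job (e.g. one map task's input).
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()
result = work()
elapsed = time.perf_counter() - start
print(f"one task: {elapsed:.3f}s")
```

Running the same slice on both CPU types, alone and with all cores busy, tells you far more about your workload than the spec sheets do.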

On Thu, Oct 11, 2012 at 10:56 AM, Russell Jurney <russell.jurney@gmail.com> wrote:
Anyone got data on this? This is interesting, and somewhat counter-intuitive.

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 10:47 AM, Jay Vyas <jayunit100@gmail.com> wrote:

> Presumably, if you have a reasonable number of cores, speeding the cores up will be
better than forking a task into smaller and smaller chunks, because at some point the overhead
of multiple processes becomes a bottleneck -- maybe due to streaming reads and writes? I'm
sure each and every problem has a different sweet spot.
