Date: Sat, 13 Oct 2012 00:22:01 -0700
Subject: Re: Why they recommend this (CPU) ?
From: Russell Jurney <russell.jurney@gmail.com>
To: user@hadoop.apache.org

Wow, thanks for an awesome reply, Steve!

On Friday, October 12, 2012, Steve Loughran wrote:
>
> On 11 October 2012 20:47, Goldstone, Robin J. wrote:
>
>> Be sure you are comparing apples to apples. The E5-2650 has a larger
>> cache than the E5-2640, a faster system bus, and support for faster
>> (1600MHz vs 1333MHz) DRAM, resulting in greater potential memory
>> bandwidth.
>>
>> http://ark.intel.com/compare/64590,64591
>>
> mmm. There is more L3 cache, and in-CPU sync can be done better than over
> the inter-socket bus -you're also less vulnerable to NUMA memory
> allocation issues (*).
>
> There's another issue that drives these recommendations, namely the price
> curve that server parts follow over time: the Bill-of-Materials curve,
> aka the "BOM curve". Most parts come in at one price, and that price
> drops over time as a function of volume parts shipped covering
> Non-Recurring Engineering (NRE) costs, improvements in yield and
> manufacturing quality in that specific process, etc., until it levels out
> at an actual selling price (ASP) to the people who make the boxes
> (Original Design Manufacturers, ODMs), where it tends to stay for the
> rest of that part's lifespan.
>
> DRAM and HDDs follow a fairly predictable exponential decay curve.
> You can look at the cost of a part and its history, determine the
> variables, and then come up with a prediction of how much it will cost
> at a time in the near future. These BOM curves were key to Dell's
> business model -direct sales to customers meant they didn't need so much
> inventory and could actually get into a situation where they had the
> cash from the customer before the ODM had built the box, let alone been
> paid for it. There was a price: utter unpredictability of what DRAM and
> HDDs you were going to get. Server-side, things have stabilised and all
> the tier-1 PC vendors qualify a set of DRAM and storage options so they
> can source from multiple vendors, eliminating a single vendor as a SPOF
> and allowing them to negotiate better on the cost of parts -which again
> changes that BOM curve.
>
> This may seem strange, but you should all know that the retail price of
> a laptop, flatscreen TV, etc. comes down over time -what's not so
> obvious is the maths behind the changes in its price.
>
> One of the odd parts in this business is the CPU. There is a
> near-monopoly in supplies, and Intel don't want their business at the
> flat bit of the curve. They need the money not just to keep their
> shareholders happy, but for the $B needed to build the next generation
> of fabs and hence continue to keep their shareholders happy in future.
> Intel parts come in high when they initially ship, and stay at that
> price until the next time Intel change their price list, which is
> usually quarterly. The first price change is very steep, then the
> gradient d$/dT reduces; once it gets low enough, that part drops off the
> price list never to be seen again, except maybe in embedded designs.
>
> What does that mean?
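Steve's decay-curve prediction a few paragraphs up can be sketched numerically. This is a minimal illustration, not anything from the thread: the monthly prices and the price floor are made up, and a floor-plus-exponential form p(t) = floor + a*exp(-k*t) is assumed, fitted by ordinary least squares on log(p - floor).

```python
import math

# Hypothetical monthly street prices for a commodity part (made-up numbers),
# assumed to follow p(t) = floor + a * exp(-k * t).
floor = 40.0                      # assumed long-run ASP the curve levels out at
prices = [200.0, 160.0, 130.0, 112.0, 98.0, 83.0]   # months 0..5

# Fit log(p - floor) against t with a least-squares line.
ts = list(range(len(prices)))
ys = [math.log(p - floor) for p in prices]

n = len(ts)
mean_t = sum(ts) / n
mean_y = sum(ys) / n
slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / \
        sum((t - mean_t) ** 2 for t in ts)
intercept = mean_y - slope * mean_t

def predict(month):
    """Predicted price at a future month under the fitted decay model."""
    return floor + math.exp(intercept + slope * month)

for m in (6, 9, 12):
    print(f"month {m:2d}: ~${predict(m):.0f}")
```

With a handful of observed price points, the same two fitted parameters give a near-future estimate -which is all the "look at a part's history, determine the variables, predict" step requires.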
> It means you pay a lot for the top-of-the-line x86 CPUs, and unless you
> are 100% sure that you really need them, you may be better off investing
> your money in:
>  -more DRAM with better ECC (product placement: Chipkill) and buffering:
>   less swapping, and the ability to run more reducers/node.
>  -more HDDs: more storage in the same # of racks, assuming your site can
>   take the weight.
>  -SFF HDDs: less storage but more IO bandwidth off the disks.
>  -SSDs: faster storage.
>  -GPUs: very good performance for algorithms you can recompile onto them.
>  -support from Hortonworks to keep your Hadoop cluster going.
>  -10GbE networking, or multiple bonded 1GbE.
>  -more servers (this becomes more of a factor on larger clusters, where
>   the cost savings of the less expensive parts scale up).
>  -paying the electricity bill.
>  -keeping the cost of building up a Hadoop cluster down, making it more
>   affordable to store PB of data whose value will only appreciate over
>   time.
>  -paying your ops team more money, keeping them happier and so
>   increasing the probability they will field the 4am support crisis.
>
> That's why it isn't clear-cut that 8 cores are better. It's not just a
> simple performance question -it's the opportunity cost of the price
> difference, scaled up by the number of nodes. You do -as Ted pointed
> out- need to know what you actually want.
>
> Finally, as a basic "data science" exercise for the reader:
>
> 1. Calculate the price curve of, say, a Dell laptop, and compare it with
> the price curve of an Apple laptop introduced with the same CPU at the
> same time. Don't look at the absolute values -normalising them to a
> percentage gives a better view.
> 2. Look at which one follows a soft gradient and which follows more of a
> step function.
> 3. Add to the graph the Intel pricing and see how that correlates with
> the ASP.
> 4. Determine from this which vendor has the best margins -not just at
> time of release, but over the lifespan of a product.
> Integration is a useful technique here. Bear in mind that Apple's NRE
> costs on a laptop are higher due to the better HW design, but also that
> the software development is funded from their sales alone.
> 5. Using this information, decide when is the best time to buy a Dell or
> an Apple laptop.
>
> I should make a blog post of this: "server prices: it's all down to the
> exponential decay equations of the individual parts".
>
> Steve "why yes, I have spent time in the PC industry" Loughran
>
> (*) If you don't know what NUMA is, do some research and think about its
> implications in heap allocation.
>
>> From: Patrick Angeles <patrick@cloudera.com>
>> Reply-To: "user@hadoop.apache.org"
>> Date: Thursday, October 11, 2012 12:36 PM
>> To: "user@hadoop.apache.org"
>> Subject: Re: Why they recommend this (CPU) ?
>>
>> If you look at comparable Intel parts:
>>
>> Intel E5-2640
>> 6 cores @ 2.5 GHz
>> 95W - $885
>>
>> Intel E5-2650
>> 8 cores @ 2.0 GHz
>> 95W - $1107
>>
>> So, for $400 more on a dual-proc system -- which really isn't much --
>> you get 2 more cores at a 20% lower clock speed. I can believe that for
>> some scenarios, the faster cores would fare better. Gzip compression is
>> one that comes to mind, where you are aggressively trading CPU for
>> lower storage volume and IO. An HBase cluster is another example.
>>
>> On Thu, Oct 11, 2012 at 3:03 PM, Russell Jurney wrote:
>>
>>> My own clusters are too temporary and virtual for me to notice. I
>>> haven't thought of clock speed as having mattered in a long time, so
>>> I'm curious what kind of use cases might benefit from faster cores. Is
>>> there a category in some way where this sweet spot for faster cores
>>> occurs?
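The E5-2640/E5-2650 list prices quoted above reduce to simple unit costs. A quick back-of-envelope sketch, using only the prices and specs from the thread (the "aggregate GHz" metric is a naive illustrative simplification, not a real benchmark):

```python
# Unit costs for the two parts quoted in the thread.
# "Aggregate GHz" = cores * clock, a crude stand-in for total throughput.
parts = {
    "E5-2640": {"cores": 6, "ghz": 2.5, "price": 885},
    "E5-2650": {"cores": 8, "ghz": 2.0, "price": 1107},
}

for name, p in parts.items():
    agg = p["cores"] * p["ghz"]          # naive aggregate clock
    print(f"{name}: ${p['price'] / p['cores']:.0f}/core, "
          f"${p['price'] / agg:.0f}/aggregate-GHz ({agg:.0f} GHz total)")
```

The 8-core part is cheaper per core but pricier per aggregate GHz, which is exactly why the answer depends on the workload, as Ted says below.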
>>>
>>> Russell Jurney http://datasyndrome.com
>>>
>>> On Oct 11, 2012, at 11:39 AM, Ted Dunning wrote:
>>>
>>> You should measure your workload. Your experience will vary
>>> dramatically with different computations.
>>>
>>> On Thu, Oct 11, 2012 at 10:56 AM, Russell Jurney wrote:
>>>
>>>> Anyone got data on this? This is interesting, and somewhat
>>>> counter-intuitive.
>>>>
>>>> Russell Jurney http://datasyndrome.com
>>>>
>>>> On Oct 11, 2012, at 10:47 AM, Jay Vyas wrote:
>>>>
>>>> > Presumably, if you have a reasonable number of cores, speeding the
>>>> cores up will be better than forking a task into smaller and smaller
>>>> chunks - because at some point the overhead of multiple processes
>>>> would be a bottleneck - maybe due to streaming reads and writes? I'm
>>>> sure each and every problem has a different sweet spot.

--
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com