Date: Sat, 13 Oct 2012 00:22:01 -0700
Subject: Re: Why they recommend this (CPU) ?
From: Russell Jurney <russell.jurney@gmail.com>
To: user@hadoop.apache.org

Wow, thanks for an awesome reply, Steve!

On Friday, October 12, 2012, Steve Loughran wrote:
>
> On 11 October 2012 20:47, Goldstone, Robin J. wrote:
>
>> Be sure you are comparing apples to apples. The E5-2650 has a larger
>> cache than the E5-2640, a faster system bus, and support for faster
>> (1600MHz vs 1333MHz) DRAM, resulting in greater potential memory
>> bandwidth.
>>
>> http://ark.intel.com/compare/64590,64591
>>
> mmm. There is more L3 cache, and in-CPU sync can be done better than over
> the inter-socket bus -you're also less vulnerable to NUMA memory
> allocation issues (*).
>
> There's another issue that drives these recommendations, namely the price
> curve that server parts follow over time: the Bill-of-Materials curve,
> aka the "BOM curve". Most parts come in at one price, and that price
> drops over time as a function of volume parts shipped covering
> Non-Recurring Engineering (NRE) costs, improvements in yield and
> manufacturing quality in that specific process, etc., until it levels out
> at an actual selling price (ASP) to the people who make the boxes
> (Original Design Manufacturers, ODMs), where it tends to stay for the
> rest of that part's lifespan.
>
> DRAM and HDDs follow a fairly predictable exponential decay curve.
> You can look at the cost of a part and its history, determine the
> variables, and then come up with a prediction of how much it will cost
> at a time in the near future. These BOM curves were key to Dell's
> business model -direct sales to customers meant they didn't need so much
> inventory and could actually get into a situation where they had the
> cash from the customer before the ODM had built the box, let alone been
> paid for it. There was a price: utter unpredictability of what DRAM and
> HDDs you were going to get. Server-side, things have stabilised and all
> the tier-1 PC vendors qualify a set of DRAM and storage options so they
> can source from multiple vendors, eliminating a single vendor as a SPOF
> and allowing them to negotiate better on the cost of parts -which again
> changes that BOM curve.
>
> This may seem strange, but you should all know that the retail price of
> a laptop, flatscreen TV, etc. comes down over time -what's not so
> obvious is the maths behind the changes in its price.
>
> One of the odd parts in this business is the CPU. There is a
> near-monopoly in supplies, and Intel don't want their business at the
> flat bit of the curve. They need the money not just to keep their
> shareholders happy, but for the $B needed to build the next generation
> of fabs and hence continue to keep their shareholders happy in future.
> Intel parts come in high when they initially ship, and stay at that
> price until the next time Intel change their price list, which is
> usually quarterly. The first price change is very steep, then the
> gradient d$/dT reduces; once it gets low enough, that part drops off the
> price list never to be seen again, except maybe in embedded designs.
>
> What does that mean?
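Steve's decay-curve prediction a few paragraphs up can be sketched numerically. This is a minimal illustration, not anything from the thread: the monthly prices and the price floor are made up, and a floor-plus-exponential form p(t) = floor + a*exp(-k*t) is assumed, fitted by ordinary least squares on log(p - floor).

```python
import math

# Hypothetical monthly street prices for a commodity part (made-up numbers),
# assumed to follow p(t) = floor + a * exp(-k * t).
floor = 40.0                      # assumed long-run ASP the curve levels out at
prices = [200.0, 160.0, 130.0, 112.0, 98.0, 83.0]   # months 0..5

# Fit log(p - floor) against t with a least-squares line.
ts = list(range(len(prices)))
ys = [math.log(p - floor) for p in prices]

n = len(ts)
mean_t = sum(ts) / n
mean_y = sum(ys) / n
slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / \
        sum((t - mean_t) ** 2 for t in ts)
intercept = mean_y - slope * mean_t

def predict(month):
    """Predicted price at a future month under the fitted decay model."""
    return floor + math.exp(intercept + slope * month)

for m in (6, 9, 12):
    print(f"month {m:2d}: ~${predict(m):.0f}")
```

With a handful of observed price points, the same two fitted parameters give a near-future estimate -which is all the "look at a part's history, determine the variables, predict" step requires.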
> It means you pay a lot for the top-of-the-line x86 CPUs, and unless you
> are 100% sure that you really need them, you may be better off investing
> your money in:
>  -more DRAM with better ECC (product placement: Chipkill) and buffering:
>   less swapping, and the ability to run more reducers/node.
>  -more HDDs: more storage in the same # of racks, assuming your site can
>   take the weight.
>  -SFF HDDs: less storage but more IO bandwidth off the disks.
>  -SSDs: faster storage.
>  -GPUs: very good performance for algorithms you can recompile onto them.
>  -support from Hortonworks to keep your Hadoop cluster going.
>  -10GbE networking, or multiple bonded 1GbE.
>  -more servers (this becomes more of a factor on larger clusters, where
>   the cost savings of the less expensive parts scale up).
>  -paying the electricity bill.
>  -keeping the cost of building up a Hadoop cluster down, making it more
>   affordable to store PB of data whose value will only appreciate over
>   time.
>  -paying your ops team more money, keeping them happier and so
>   increasing the probability they will field the 4am support crisis.
>
> That's why it isn't clear-cut that 8 cores are better. It's not just a
> simple performance question -it's the opportunity cost of the price
> difference, scaled up by the number of nodes. You do -as Ted pointed
> out- need to know what you actually want.
>
> Finally, as a basic "data science" exercise for the reader:
>
> 1. Calculate the price curve of, say, a Dell laptop, and compare it with
> the price curve of an Apple laptop introduced with the same CPU at the
> same time. Don't look at the absolute values -normalising them to a
> percentage gives a better view.
> 2. Look at which one follows a soft gradient and which follows more of a
> step function.
> 3. Add to the graph the Intel pricing and see how that correlates with
> the ASP.
> 4. Determine from this which vendor has the best margins -not just at
> time of release, but over the lifespan of a product.
> Integration is a useful technique here. Bear in mind that Apple's NRE
> costs on a laptop are higher due to the better HW design, but also that
> the software development is funded from their sales alone.
> 5. Using this information, decide when is the best time to buy a Dell or
> an Apple laptop.
>
> I should make a blog post of this: "server prices: it's all down to the
> exponential decay equations of the individual parts".
>
> Steve "why yes, I have spent time in the PC industry" Loughran
>
> (*) If you don't know what NUMA is, do some research and think about its
> implications in heap allocation.
>
>> From: Patrick Angeles <patrick@cloudera.com>
>> Reply-To: "user@hadoop.apache.org"
>> Date: Thursday, October 11, 2012 12:36 PM
>> To: "user@hadoop.apache.org"
>> Subject: Re: Why they recommend this (CPU) ?
>>
>> If you look at comparable Intel parts:
>>
>> Intel E5-2640
>> 6 cores @ 2.5 GHz
>> 95W - $885
>>
>> Intel E5-2650
>> 8 cores @ 2.0 GHz
>> 95W - $1107
>>
>> So, for $400 more on a dual-proc system -- which really isn't much --
>> you get 2 more cores at a 20% lower clock speed. I can believe that for
>> some scenarios, the faster cores would fare better. Gzip compression is
>> one that comes to mind, where you are aggressively trading CPU for
>> lower storage volume and IO. An HBase cluster is another example.
>>
>> On Thu, Oct 11, 2012 at 3:03 PM, Russell Jurney wrote:
>>
>>> My own clusters are too temporary and virtual for me to notice. I
>>> haven't thought of clock speed as having mattered in a long time, so
>>> I'm curious what kind of use cases might benefit from faster cores. Is
>>> there a category in some way where this sweet spot for faster cores
>>> occurs?
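The E5-2640/E5-2650 list prices quoted above reduce to simple unit costs. A quick back-of-envelope sketch, using only the prices and specs from the thread (the "aggregate GHz" metric is a naive illustrative simplification, not a real benchmark):

```python
# Unit costs for the two parts quoted in the thread.
# "Aggregate GHz" = cores * clock, a crude stand-in for total throughput.
parts = {
    "E5-2640": {"cores": 6, "ghz": 2.5, "price": 885},
    "E5-2650": {"cores": 8, "ghz": 2.0, "price": 1107},
}

for name, p in parts.items():
    agg = p["cores"] * p["ghz"]          # naive aggregate clock
    print(f"{name}: ${p['price'] / p['cores']:.0f}/core, "
          f"${p['price'] / agg:.0f}/aggregate-GHz ({agg:.0f} GHz total)")
```

The 8-core part is cheaper per core but pricier per aggregate GHz, which is exactly why the answer depends on the workload, as Ted says below.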
>>>
>>> Russell Jurney http://datasyndrome.com
>>>
>>> On Oct 11, 2012, at 11:39 AM, Ted Dunning wrote:
>>>
>>> You should measure your workload. Your experience will vary
>>> dramatically with different computations.
>>>
>>> On Thu, Oct 11, 2012 at 10:56 AM, Russell Jurney wrote:
>>>
>>>> Anyone got data on this? This is interesting, and somewhat
>>>> counter-intuitive.
>>>>
>>>> Russell Jurney http://datasyndrome.com
>>>>
>>>> On Oct 11, 2012, at 10:47 AM, Jay Vyas wrote:
>>>>
>>>> > Presumably, if you have a reasonable number of cores, speeding the
>>>> cores up will be better than forking a task into smaller and smaller
>>>> chunks - because at some point the overhead of multiple processes
>>>> would be a bottleneck - maybe due to streaming reads and writes? I'm
>>>> sure each and every problem has a different sweet spot.

--
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com