hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: More cores Vs More Nodes ?
Date Thu, 15 Dec 2011 20:57:50 GMT


I've said this before and I'm going to say it again.

Your knowledge of Hadoop is purely academic. It may be OK for talking to C-level execs who visit
the San Jose IM Lab or Markham, but when you give answers on issues where you don't have
first-hand practical experience, you end up doing more harm than good.

The problem is that too many people blindly accept what they see on the web as fact when it's
not always accurate and may not suit their needs.
I've lost count of the number of hours I've spent in meetings trying to undo the damage caused
by someone saying "... but FB does it this way... therefore that's how we should do it."

Now Michael St.Ack is a pretty smart guy. He knows his shit. He's extremely credible. However,
when he says that FB does something a specific way, that is because FB has certain requirements
and the solution works for them. It doesn't mean that it will be the best solution for your

And Tom, if we pull out your business card, you have a nice fancy title with IBM. So you instantly
have some credibility. Unfortunately, you're no St.Ack. (I'd put a smiley face but I'm actually
trying to be serious.)

Even in this post, you continue to go down the wrong path. 
Unfortunately I don't have time to lecture you on why what you said is wrong and that your
thoughts on cluster design are way off base. 
Oh and I tease you because frankly, you deserve it. 

I have to apologize to everyone on the list, but in the past, you failed to actually stop
and take the hint that maybe you need to rethink your views on Hadoop, and that if you had practical
experience setting up actual clusters (not EC2 clusters), you would have the necessary understanding
of what can go wrong and how to fix it.

If I get time, I'll have to find my copy of "Up Front" by Bill Mauldin. There's a cartoon
that really fits you.


> To: common-user@hadoop.apache.org
> Subject: RE: More cores Vs More Nodes ?
> From: tdeutsch@us.ibm.com
> Date: Wed, 14 Dec 2011 11:40:51 -0800
> Your eagerness to insult is throwing you off track here Michael. 
> For example, the workload profile of a cluster doing heavy NLP is very 
> different from one serving as a destination for large scale 
> application/web logs. Ditto for P&C risk modeling vs smart meter use 
> cases, etc etc...Those are not general purpose clusters. You may - and 
> should I'd say - have the NLP use cases in a common analytics environment 
> (internal cloud model) for sharing of methods/skills, but putting 
> orthogonal use cases on that cluster is not inherently a best practice.
> How those clusters should be built does vary, and no it is not uncommon to 
> have focused use cases like that. If you know it is going to be a general 
> purpose cluster then do build it in a balanced spec. 
> ------------------------------------------------
> Tom Deutsch
> Program Director
> Information Management
> Big Data Technologies
> 3565 Harbor Blvd
> Costa Mesa, CA 92626-1420
> tdeutsch@us.ibm.com