hadoop-common-user mailing list archives

From Michel Segel <michael_se...@hotmail.com>
Subject Re: Hadoop cluster hardware details for big data
Date Wed, 06 Jul 2011 12:18:51 GMT
Wasn't the answer 42?  ;-P

Looking at your calc...
You forgot to factor in the number of slots per node, so the machine count is only a
fraction of what you computed. Assume 10 slots per node. (10 because it makes the
math easier.)

Then you need only 300 machines. You could name your cluster lambda. (Another literary
reference...)

300 machines is a manageable cluster.
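
For anyone who wants to redo the arithmetic, here is a minimal sketch in Python. The
3,000-slot total is a hypothetical figure, chosen only so the numbers line up with the
300-machine estimate above; the original calculation isn't quoted in this thread.

    import math

    # Hypothetical inputs: the earlier calc isn't quoted in this thread,
    # so the slot total below is an assumption that reproduces the
    # 300-node figure.
    required_task_slots = 3000   # total concurrent map/reduce tasks needed
    slots_per_node = 10          # "10 because it makes the math easier"

    # Nodes needed is the slot total divided by slots per node,
    # rounded up to whole machines.
    nodes = math.ceil(required_task_slots / slots_per_node)
    print(nodes)  # -> 300

The point of the correction: forgetting to divide by the per-node slot count inflates
the machine estimate tenfold here.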

I agree that the initial question is vague and the only true answer is 'it depends...'
But if they want to build out a cluster of 300 machines... I've got a guy... :-)



Sent from a remote device. Please excuse any typos...

Mike Segel

On Jul 6, 2011, at 6:32 AM, Steve Loughran <stevel@apache.org> wrote:

> On 06/07/11 11:43, Karthik Kumar wrote:
>> Hi,
>> 
>> Has anyone here used Hadoop to process more than 3 TB of data? If so, we
>> would like to know how many machines you used in your cluster and
>> about the hardware configuration. The objective is to learn how to
>> handle huge data in a Hadoop cluster.
>> 
> 
> Actually, I've just thought of a simpler answer: 40. It's completely random, but if said
> with confidence it's as valid as any other answer to your current question.
> 
