hadoop-common-user mailing list archives

From Michel Segel <michael_se...@hotmail.com>
Subject Re: Hadoop cluster hardware details for big data
Date Wed, 06 Jul 2011 12:18:51 GMT
Wasn't the answer 42?  ;-P

Looking at your calc...
You forgot to factor in the number of slots per node.
So the number is only a fraction of that. Assume 10 slots per node. (10 because it makes the math easy.)

Then you need only 300 machines. You could then name your cluster lambda. (Another literary reference.)

300 machines is a manageable cluster.
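Roughly, the arithmetic is just slots-needed divided by slots-per-node. A minimal sketch in Python; the 3000 figure below is inferred by working backwards from 300 nodes at 10 slots each, it isn't quoted from the earlier calc:

  # Back-of-the-envelope sizing: nodes = concurrent task slots needed / slots per node.
  required_task_slots = 3000   # hypothetical total, inferred from 300 nodes x 10 slots
  slots_per_node = 10          # "10 because it makes the math easy"

  nodes_needed = required_task_slots // slots_per_node
  print(nodes_needed)          # -> 300, a manageable cluster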

I agree that the initial question is vague and the only true answer is 'it depends...'
But if they want to build out a cluster of 300 machines... I've got a guy... :-)

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jul 6, 2011, at 6:32 AM, Steve Loughran <stevel@apache.org> wrote:

> On 06/07/11 11:43, Karthik Kumar wrote:
>> Hi,
>> Has anyone here used hadoop to process more than 3TB of data? If so we
>> would like to know how many machines you used in your cluster and
>> about the hardware configuration. The objective is to know how to
>> handle huge data in Hadoop cluster.
> Actually, I've just thought of a simpler answer: 40. It's completely random, but if said
> with confidence it's as valid as any other answer to your current question.
