hadoop-common-user mailing list archives

From Allen Wittenauer <...@apache.org>
Subject Re: Hadoop cluster optimization
Date Mon, 22 Aug 2011 04:05:37 GMT

On Aug 21, 2011, at 7:17 PM, Michel Segel wrote:

> Avi,
> First, why a 32-bit OS?
> You have a 64-bit processor with 4 hyper-threaded cores, which appears as 8 CPUs.

	With only 1.7 GB of memory, there likely isn't much of a reason to use a 64-bit OS.  The machines
(as you point out) are already tight on memory; 64-bit is only going to make it worse.
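
	For what it's worth, on a node this small the usual move is to cap the number of concurrent tasks and the child JVM heaps so they fit in physical memory.  The mapred-site.xml fragment below is a sketch only; the property names are the CDH3-era ones, but the values are illustrative guesses for a 1.7 GB machine, not a recommendation from this thread:

```xml
<!-- Hypothetical mapred-site.xml fragment for a ~1.7 GB node.
     Values are illustrative; tune to leave headroom for the
     TaskTracker, DataNode, and the OS itself. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx200m</value>
</property>
```

With 2 maps + 1 reduce at 200 MB each, the task JVMs stay around 600 MB, leaving the rest for the daemons and page cache.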

>> 1.7 GB memory
>> 1 Intel(R) Xeon(R) CPU E5507 @ 2.27GHz
>> Ubuntu Server 10.10 , 32-bit platform
>> Cloudera CDH3 Manual Hadoop Installation
>> (for the ones who are familiar with Amazon Web Services, I am talking about
>> Small EC2 Instances/Servers)
>> Total job run time is +-15 minutes (+-50 files/blocks/mapTasks of up to 250
>> MB and 10 reduce tasks).
>> Based on the above information, can anyone recommend a best-practice
>> configuration?

	How many spindles?  Are your tasks spilling?
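
	You can tell whether you are spilling from the job's "Spilled Records" counter: if it is much larger than "Map output records", the map-side sort buffer is too small and each map task is doing extra disk I/O.  On CDH3 that buffer is io.sort.mb; the fragment below is a sketch with an illustrative value, and it must fit inside whatever child heap you set in mapred.child.java.opts:

```xml
<!-- Hypothetical mapred-site.xml fragment: enlarge the in-memory
     buffer used to sort map output before it spills to disk.
     Illustrative value; keep it well under the child JVM heap. -->
<property>
  <name>io.sort.mb</name>
  <value>100</value>
</property>
```

If the counters show little or no spilling, the bottleneck is more likely disk or CPU, which is why the spindle count matters.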

>> Do you think that, when dealing with such a small cluster and
>> processing such a small amount of data,
>> it is even possible to optimize jobs so they run much faster?

	Most of the time, performance issues are with the algorithm, not Hadoop.
