hadoop-mapreduce-user mailing list archives

From "Ratner, Alan S (IS)" <Alan.Rat...@ngc.com>
Subject Re: Making optimum use of cores
Date Wed, 15 Sep 2010 17:38:22 GMT
Thanks for the quick responses.  I raised the 2 parameters to 14 (figuring there might be other
apps running - like Zookeeper - that might want some cores of their own).  This has made a
qualitative difference - the System Monitor now shows much higher squiggly lines indicating
better distribution of the job to the various cores.  However, the quantitative difference
is insignificant - my job runs only about 4% faster.  I hope I don't have to migrate from
Java to the C++ API.
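
For reference, here is roughly what I changed in conf/mapred-site.xml (property names as in Hadoop 0.20; the value of 14 rather than 16 leaves a couple of virtual cores free for other daemons):

```xml
<!-- conf/mapred-site.xml: raise the per-node task slots from the default of 2 -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>14</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>14</value>
</property>
```

(The TaskTrackers need a restart to pick these up.)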


-----Original Message-----
From: Mohamed Riadh Trad [mailto:Mohamed.trad@inria.fr] 
Sent: Wednesday, September 15, 2010 10:24 AM
To: Christopher.Shain@sungard.com; mapreduce-user@hadoop.apache.org
Subject: EXTERNAL:Re: Making optimum use of cores

Hi Christopher,

I used to work at SunGard (Global Trading); I left two years ago... Hope you enjoy working there.

When it comes to performance, you might prefer the C++ API. By setting the map slots
per node to the number of virtual CPUs per node, you can fully parallelize jobs and drive
all 16 virtual cores of the Nehalem CPUs (i.e. up to 1600% CPU as reported per core).


Le 15 sept. 2010 à 16:00, <Christopher.Shain@sungard.com> <Christopher.Shain@sungard.com>
a écrit :

> It seems likely that you are only running one (single-threaded) map or reduce operation
> per worker node. Do you know whether you are in fact running multiple operations?
> This also sounds like it may be a manifestation of a question that I have seen a lot
> on the mailing lists lately: people do not know how to increase the number of task
> slots in their tasktracker configuration.  This is normally controlled via
> mapred.tasktracker.{map|reduce}.tasks.maximum in mapred-site.xml.  The default of
> 2 each is probably too low for your servers.
> ----- Original Message -----
> From: Ratner, Alan S (IS) <Alan.Ratner@ngc.com>
> To: mapreduce-user@hadoop.apache.org <mapreduce-user@hadoop.apache.org>
> Sent: Wed Sep 15 09:47:47 2010
> Subject: Making optimum use of cores
> I'm running Hadoop 0.20.2 on a cluster of servers running Ubuntu 10.4.
> Each server has 2 quad-core Nehalem CPUs for a total of 8 physical cores
> running as 16 virtual cores.  Ubuntu's System Monitor displays 16
> squiggly lines showing usage of the 16 virtual cores.  We only seem to
> be making use of one of the 16 virtual cores on any slave node and even
> on the master node only one virtual core is significantly busy at a
> time.  Is there a way to make better use of the cores?  Presumably I
> could run Hadoop in a VM assigned to each virtual core but I would think
> there must be a more elegant solution.
> Alan Ratner
