hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: How does hadoop decide how many reducers to run?
Date Fri, 11 Jan 2013 23:20:27 GMT

First, not enough information. 

1) EC2 got it. 
2) Which flavor of Hadoop? Is this EMR as well? 
3) How many slots did you configure in your mapred-site.xml? 

AWS EC2 cores aren't going to be hyperthreaded cores, so 8 cores means you will probably have 6 cores available for slots.
With 16 reducers running at once, it sounds like you have 4 map slots and 4 reduce slots (8 slots) configured per node. (Oversubscription is OK if you're not running HBase.)
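
The slot counts asked about in (3) live in mapred-site.xml on each TaskTracker. As a sketch only (property names are from the Hadoop 1.x / MRv1 era; the values of 4 and 4 are an assumption matching the guess above, not something stated in the thread), it would look like:

```xml
<!-- mapred-site.xml on each TaskTracker (Hadoop 1.x / MRv1).
     Values are illustrative, not taken from the thread. -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value> <!-- concurrent map slots per node -->
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value> <!-- concurrent reduce slots per node -->
  </property>
</configuration>
```

With 4 nodes at 4 reduce slots each, that caps concurrency at 16 reducers at a time, which matches the waves the original poster saw.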

So what are you missing? 

On Jan 11, 2013, at 4:59 PM, Roy Smith <roy@panix.com> wrote:

> I ran a big job the other day on a cluster of 4 m2.4xlarge EC2 instances.  Each instance
> is 8 cores, so 32 cores total.  Hadoop ran 16 reducers, followed by a second wave of 12.
> It seems to me it was only using half the available cores.  Is this normal?  Is there some
> way to force it to use all the cores?
> ---
> Roy Smith
> roy@panix.com
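
To the question in the subject line: in MRv1, Hadoop doesn't decide the reducer count from the number of cores at all. The job requests it via mapred.reduce.tasks (default 1), and the cluster's reduce slots only cap how many run concurrently. A ToolRunner-based job can request one reducer per core from the command line; the jar and driver names here are hypothetical:

```shell
# Hypothetical jar/driver names; -D key=value is the standard
# GenericOptionsParser override for ToolRunner-based jobs.
hadoop jar myjob.jar MyDriver -D mapred.reduce.tasks=32 input/ output/
```

Equivalently, the driver can call job.setNumReduceTasks(32) before submitting.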
