hadoop-common-user mailing list archives

From "Dmitry Pushkarev" <u...@stanford.edu>
Subject RE: Resource allocation for map tasks
Date Sun, 11 Apr 2010 01:54:12 GMT
I'll try using Yahoo!'s version, 0.20.9. Thanks.

Right now I'm still on 0.19.0. What is the expected date of the 0.21
release?

-----Original Message-----
From: Arun C Murthy [mailto:acm@yahoo-inc.com] 
Sent: Saturday, April 10, 2010 5:07 PM
To: common-user@hadoop.apache.org
Subject: Re: Resource allocation for map tasks

On Apr 10, 2010, at 4:02 PM, Dmitry Pushkarev wrote:
> I have a cluster where each node can run up to 8 map tasks (one task
> per core). Now we have realized that we need to run another type of job
> that has much larger memory requirements, which will only allow up to
> 4 tasks to be run on each node. Is it possible to somehow specify that
> each map process of that new task "occupies" two map slots, so that at
> most 4 such maps will be launched?
>

Which MR scheduler are you running?

The CapacityScheduler
(http://hadoop.apache.org/common/docs/r0.20.0/capacity_scheduler.html)
has exactly the feature you are looking for; it's called 'High RAM
jobs'. I'm not sure whether the FairScheduler has this feature, so I'll
let someone more knowledgeable comment on the FS.
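
As a rough illustration of the slot accounting described in the question (not the CapacityScheduler's actual code), the arithmetic can be sketched as below. The function names and the memory figures are hypothetical, and the rounding-up rule is an assumption about how a task's memory request maps onto whole slots.

```python
import math


def slots_per_task(task_memory_mb, slot_memory_mb):
    """Number of whole map slots one task occupies.

    Assumption: a task that needs more memory than a single slot
    provides is charged for enough slots to cover its request,
    rounded up, and never for fewer than one slot.
    """
    return max(1, math.ceil(task_memory_mb / slot_memory_mb))


def max_concurrent_tasks(slots_per_node, task_memory_mb, slot_memory_mb):
    """How many such tasks fit on one node at the same time."""
    return slots_per_node // slots_per_task(task_memory_mb, slot_memory_mb)


# Hypothetical figures: 8 map slots per node, 2048 MB per slot.
normal = max_concurrent_tasks(8, 2048, 2048)    # ordinary job
high_ram = max_concurrent_tasks(8, 4096, 2048)  # task needs twice the memory
print(normal, high_ram)
```

With these figures an ordinary task takes one slot and a task asking for twice the per-slot memory takes two, so a node with 8 slots runs at most 4 of the larger tasks, which matches the behaviour asked for above.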

Unfortunately, this feature in the CS is available only in
trunk/hadoop-0.21, which hasn't been released yet.

We, at Yahoo!, run a version of hadoop-0.20 that includes a backport of
this feature in the CS:
http://github.com/yahoo/hadoop-common/commits/yahoo-hadoop-0.20.9-stable

Arun


