hadoop-common-user mailing list archives

From Renaud Delbru <renaud.del...@deri.org>
Subject Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2
Date Tue, 25 Jan 2011 15:57:43 GMT
Our experience with the Capacity Scheduler was not what we expected, and
not what you describe. But this might be due to a misunderstanding of the
configuration parameters.
The problem is the following:
mapred.capacity-scheduler.queue.<queue-name>.capacity: Percentage of the 
number of slots in the cluster that are *guaranteed* to be available for 
jobs in this queue.
mapred.capacity-scheduler.queue.<queue-name>.minimum-user-limit-percent: 
Each queue enforces a limit on the percentage of resources allocated to 
a user at any given time, if *there is competition for them*.
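
To make the two definitions concrete, here is a rough sketch of the arithmetic as we understand it. It is a simplified model, not scheduler code: the function names are made up for illustration, and the rounding is an assumption.

```python
def guaranteed_slots(total_slots: int, capacity_percent: float) -> int:
    """Slots a queue is guaranteed, given its 'capacity' percentage.

    Assumption: the result is rounded down; the real scheduler may differ.
    """
    return int(total_slots * capacity_percent / 100)


def user_limit_slots(queue_slots: int, active_users: int,
                     min_user_limit_percent: float) -> int:
    """Per-user limit inside a queue, applied only under competition.

    Assumption: with a single active user there is no competition, so
    that user may use the whole queue. With several users, each gets an
    equal share, but never less than 'minimum-user-limit-percent'.
    """
    if active_users <= 1:
        return queue_slots
    share_percent = max(100 / active_users, min_user_limit_percent)
    return int(queue_slots * share_percent / 100)


if __name__ == "__main__":
    queue = guaranteed_slots(total_slots=100, capacity_percent=40)
    for users in (1, 2, 5):
        print(users, "user(s):", user_limit_slots(queue, users, 25))
```

Note that neither value is an upper bound on what a single job can use when nobody else is competing for the slots.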

So it seems that if there is no competition and the cluster is fully
available, the scheduler will assign the full cluster to the job and
will not limit the number of concurrent tasks. It seemed to us that the
only way to enforce a hard limit was to use the Fair Scheduler of
hadoop 0.21.0, which includes a new configuration parameter, 'maxMaps'.
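
The difference we are worried about can be shown with a toy model. This is only a sketch of the behaviour described above, not code from either scheduler; the function names and the numbers are hypothetical.

```python
from typing import Optional


def concurrent_tasks(pending_tasks: int, idle_slots: int,
                     max_maps: Optional[int] = None) -> int:
    """Number of tasks of one job that run at the same time.

    Without a cap (our reading of the Capacity Scheduler when there is
    no competition), the job takes every idle slot it can use.
    With a cap (a 'maxMaps'-style hard limit), the job never runs more
    than max_maps tasks at once, even on an otherwise idle cluster.
    """
    running = min(pending_tasks, idle_slots)
    if max_maps is not None:
        running = min(running, max_maps)
    return running


if __name__ == "__main__":
    # One job with 500 pending map tasks on an idle 100-slot cluster.
    print("no hard limit:", concurrent_tasks(500, 100))
    print("hard limit 20:", concurrent_tasks(500, 100, max_maps=20))
```

In the first case the job occupies the whole cluster, which is exactly what we want to avoid.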

Am I right, or did we miss something?

cheers
-- 
Renaud Delbru

On 25/01/11 15:20, Harsh J wrote:
> Capacity Scheduler (or a version of it) does ship with the 0.20
> release of Hadoop and is usable. It can be used to assign queues with
> a limited capacity for each, which your jobs must appropriately submit
> to if you want them to utilize only the assigned fraction of your
> cluster for their processing.
>
> On Tue, Jan 25, 2011 at 5:19 PM, Renaud Delbru <renaud.delbru@deri.org> wrote:
>> Hi,
>>
>> we would like to limit the maximum number of tasks per job on our hadoop
>> 0.20.2 cluster.
>> Will the Capacity Scheduler [1] allow us to do this? Does it work
>> correctly on hadoop 0.20.2? (I remember that a few months ago we were
>> looking at it, but it seemed incompatible with hadoop 0.20.2.)
>>
>> [1] http://hadoop.apache.org/common/docs/r0.20.2/capacity_scheduler.html
>>
>> Regards,
>> --
>> Renaud Delbru
>>
>
>

