hadoop-mapreduce-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1723) Capacity Scheduler should allow configuration of Map & Reduce task slots independently per queue
Date Fri, 23 Apr 2010 07:00:53 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860147#action_12860147 ]

Hemanth Yamijala commented on MAPREDUCE-1723:

Hmm. In HADOOP-3445 (God, I am surprised I still remember the number, *smile*), which introduced
the capacity scheduler, Vivek had argued for separate percentages for map and reduce capacities.
At the time, though, consensus moved towards a single number. I think a big factor
driving that decision was the absence of limits and the presence of pre-emption. At that time,
queues could not impose limits, so spare capacity could always be used elsewhere, and
pre-emption was meant to ensure that queues could get their 'guaranteed' capacity when required.

With time, limits have come in and pre-emption has gone out, and now this valid use case
has come up. To me it seems there are two ways to approach this problem. One is
to do the enhancement proposed in the JIRA. The other is to re-introduce pre-emption. Clearly the
first option is simple and easy to understand; I can think of ways to keep the spec and
implementation simple for the default case and still support this special requirement. The
only thing bothering me is that it seems to handle a specific type of cluster setup (i.e.
the kind of queue and job profile described). The second option is clearly quite complicated.
But we've had repeated requests from people for pre-emption in the scheduler, and I think
it is a topic that's going to die only when it gets implemented. *smile*

As a side note while we are still discussing this: Subramaniam, what is the proportion of
map and reduce slots in your cluster? Are they the same?

> Capacity Scheduler should allow configuration of Map & Reduce task slots independently per queue
> ------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-1723
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1723
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.20.1
>         Environment: all
>            Reporter: Subramaniam Krishnan
>             Fix For: 0.20.3
> The Capacity Scheduler allows configuration of the percentage of task slots per queue. We
> have a scenario in which our biggest queue (50% quota) has jobs with mainly Map tasks, and
> we need to enforce strict capacity limits per queue due to SLA requirements. As a result, other smaller
> queues that require Reduce tasks get starved even though the Reduce slots are idle. The
> grid can be utilized more efficiently if the Capacity Scheduler allows the Map and
> Reduce task slot capacities to be configured independently per queue.
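
The starvation scenario in the issue description can be illustrated with a little arithmetic. The sketch below is not Capacity Scheduler code. The cluster size (100 map and 100 reduce slots), the queue names, the reduce demand, and the 10%/45% split are all hypothetical figures chosen for illustration; only the 50% quota for the biggest queue comes from the description. It compares a single percentage applied to both slot types against independent map and reduce percentages, assuming strictly enforced limits.

```python
def slot_caps(total_map, total_reduce, queue_pcts):
    """Per-queue (map_cap, reduce_cap) from (map_pct, reduce_pct) pairs."""
    return {
        name: (total_map * map_pct // 100, total_reduce * reduce_pct // 100)
        for name, (map_pct, reduce_pct) in queue_pcts.items()
    }


def idle_reduce_slots(total_reduce, caps, reduce_demand):
    """Reduce slots left unused when every queue is strictly capped."""
    used = sum(min(caps[q][1], reduce_demand.get(q, 0)) for q in caps)
    return total_reduce - used


# Hypothetical cluster and workload: the big queue runs map-heavy jobs and
# needs no reduce slots; each small queue could use 50 reduce slots.
TOTAL_MAP, TOTAL_REDUCE = 100, 100
reduce_demand = {"big": 0, "small_a": 50, "small_b": 50}

# One percentage per queue, applied to both map and reduce slots.
single = slot_caps(TOTAL_MAP, TOTAL_REDUCE,
                   {"big": (50, 50), "small_a": (25, 25), "small_b": (25, 25)})

# Independent map and reduce percentages per queue (the proposed enhancement).
split = slot_caps(TOTAL_MAP, TOTAL_REDUCE,
                  {"big": (50, 10), "small_a": (25, 45), "small_b": (25, 45)})

idle_single = idle_reduce_slots(TOTAL_REDUCE, single, reduce_demand)
idle_split = idle_reduce_slots(TOTAL_REDUCE, split, reduce_demand)
print("idle reduce slots, single percentage:", idle_single)
print("idle reduce slots, independent percentages:", idle_split)
```

Under these assumed numbers, the single-percentage setup leaves 50 reduce slots idle while the small queues are capped at 25 each; with independent percentages only 10 remain idle.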

