hadoop-common-dev mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5187) Provide an option to turn off priorities in jobs
Date Wed, 11 Feb 2009 01:03:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672472#action_12672472 ]

Matei Zaharia commented on HADOOP-5187:
---------------------------------------

Yes, the design would be as follows:

* Each job belongs to a pool. Pools may be marked as either FIFO or fair sharing.
* Each pool has a minimum share (guaranteed share) defined in the config. Any excess capacity
is divided between pools according to fair sharing, as in the current scheduler.
* Each pool takes its min share and fair share and divides them among the jobs inside the pool:
   * For a fair sharing pool, we divide the min and fair shares equally among jobs, as happens
now (technically, in proportion to job weights).
   * For a FIFO pool, we give as much of the min share as possible to the first job, give
any excess to the second job (if the first job didn't have enough unlaunched tasks to consume
the pool's full share), and so on until the share runs out. The fair share is divided the same way.
* Now, for the purpose of scheduling, we can have one big list of runnable jobs, each of which
has a min share and a fair share. We sort this list so that jobs below their min share come
first (breaking ties by how long each has been below it), followed by the remaining jobs
ordered by how far each is below its fair share (as a percentage). We then scan through the
list to pick tasks, using the same wait technique proposed in HADOOP-4667 to skip jobs that
don't happen to have local tasks for the current heartbeat.
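
To make the steps above concrete, here is a minimal sketch in Python of how a pool might subdivide a share and how the combined job list might be ordered. It is an illustration under stated assumptions, not the scheduler's actual code: the `Job` fields, the function names, and the choice to put the job that has been below its min share longest at the front are all hypothetical. The fair split ignores job demand for brevity, and the HADOOP-4667 wait technique for skipping jobs without local tasks is left out.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # All field names here are hypothetical, chosen for illustration only.
    name: str
    demand: int                    # tasks the job could still make use of
    running: int = 0               # tasks currently running
    min_share: float = 0.0         # assigned by the job's pool
    fair_share: float = 0.0        # assigned by the job's pool
    below_min_since: float = 0.0   # time at which the job fell below its min share


def split_fifo(total, jobs):
    """FIFO pool: each job, in submission order, takes what it can use."""
    remaining = total
    shares = []
    for job in jobs:
        give = min(remaining, job.demand)
        shares.append(give)
        remaining -= give
    return shares


def split_fair(total, jobs, weights=None):
    """Fair sharing pool: split in proportion to weights (equal by default)."""
    if not jobs:
        return []
    weights = weights or [1.0] * len(jobs)
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


def schedule_order(jobs):
    """Jobs below min share first, then by fraction below fair share."""
    def key(job):
        if job.running < job.min_share:
            # Assumption: the job that has waited longest goes first.
            return (0, job.below_min_since)
        if job.fair_share > 0:
            fraction_below = (job.fair_share - job.running) / job.fair_share
        else:
            fraction_below = 0.0
        # Negate so that the job furthest below its fair share sorts first.
        return (1, -fraction_below)
    return sorted(jobs, key=key)
```

For example, a FIFO pool with a share of 10 slots and three jobs that can use 6, 3, and 8 tasks would hand out 6, 3, and 1 slots, while a fair sharing pool with 9 slots and equal weights would give each job 3.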

On top of this we can have any logic we want for user limits, when to initialize jobs, etc
(as we've been talking about in other JIRAs).

I think this should work without very complicated code, and will be much easier to understand
than the current deficit stuff. It also leaves the option open to have pools with scheduling
disciplines other than FIFO or fair sharing, since the job of each pool is just to subdivide
its own min and fair shares among the jobs within it. This might enable something like HADOOP-5199.

> Provide an option to turn off priorities in jobs
> ------------------------------------------------
>
>                 Key: HADOOP-5187
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5187
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/fair-share
>            Reporter: Hemanth Yamijala
>            Priority: Minor
>
> The fairshare scheduler can define pools mapping to queues (as defined in the capacity
> scheduler - HADOOP-3445). When used in this manner, one can imagine queues set up to be used
> by users who come from disparate teams or organizations (say a default queue). For such a
> queue, it makes sense to ignore job priorities and consider the queue as strict FIFO, as it
> is difficult to compare priorities of jobs from different users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

