hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4428) Job Priorities are not handled properly
Date Tue, 21 Oct 2008 05:40:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641274#action_12641274
] 

Vivek Ratan commented on HADOOP-4428:
-------------------------------------

bq. We don't look at jobs in any other queue, if the job currently initialized is not yet
running.

I wanted to consider this in some detail. When initTasks() returns, the job is still not in
a running state, because its setup task needs to run first. For discussion's sake, assume we
have queues Q1, Q2 ... Qn and that we're considering them in that order, starting with Q1.
So after we call initTasks() on a job in Q1 (say, J1), we have the following options for
finding a task to run:
1. We can look at the next job in Q1. This is not a good option, since we'll face the same
situation: we'll call initTasks() for that job, then look at the next one, and so on.
2. We can look at jobs in the next queue. This is a viable option. It does seem a bit unfair,
because you're penalizing Q1 for the time it takes J1's setup task to run, but you could
equally well argue that this unfairness is temporary and applies to all queues alike.
3. We can return nothing to the TT. As a result, every TT that sends a heartbeat to the JT
while J1's setup task is running will get nothing to run. Most setup tasks should take only
a couple of heartbeats, so this won't be a frequent problem, but if the setup task contains
user code that does significant work, the problem is exacerbated.

Upon further reflection, I'd argue for the second approach, where we move on to the next queue.
Returning nothing to the TTs causes unnecessary under-utilization.
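
To make option 2 concrete, here's a minimal sketch of the assignment loop I have in mind.
This is not the actual CapacityScheduler code; Queue, Job, Task and their methods are
hypothetical stand-ins.

{code}
import java.util.List;

// Sketch only: on hitting a job whose setup hasn't completed, fall through
// to the next queue instead of returning nothing to the TaskTracker.
class NextQueueOnSetup {
  Task assignTask(List<Queue> queues) {
    for (Queue q : queues) {
      for (Job j : q.getJobs()) {
        if (!j.isInitialized()) {
          j.initTasks();  // setup task still has to run; job is not RUNNING yet
          break;          // option 2: give up on this queue for now, try the next
        }
        if (!j.isRunning()) {
          break;          // initialized, but setup still in flight; next queue
        }
        Task t = j.obtainTask();
        if (t != null) {
          return t;       // found a runnable task for this heartbeat
        }
      }
    }
    return null;          // nothing runnable anywhere; the TT gets no task
  }

  // Hypothetical interfaces, just enough to make the sketch self-contained.
  interface Queue { List<Job> getJobs(); }
  interface Job {
    boolean isInitialized();
    boolean isRunning();
    void initTasks();
    Task obtainTask();
  }
  interface Task {}
}
{code}

The point is the break on a not-yet-RUNNING job: we stop scanning Q1 and fall through to
Q2, rather than returning nothing for the heartbeat.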

The right way to do things, IMO, is to get the setup/cleanup tasks out of initTasks(), which
I'll argue elsewhere. But this problem (of initTasks() not necessarily changing the job's
state to RUNNING) can arise again if we decide to call initTasks() in a separate thread,
the way it's done in the default scheduler.
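
For reference, the threading pattern I'm referring to looks roughly like this (modelled
loosely on the default scheduler's eager-init behaviour; Job and its method are stand-ins,
not actual Hadoop classes):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch only: initTasks() runs on a worker thread, so the scheduler may
// observe the job in a not-yet-RUNNING state for several heartbeats.
class AsyncJobInit {
  private final ExecutorService initPool = Executors.newSingleThreadExecutor();

  void jobAdded(final Job job) {
    initPool.submit(new Runnable() {
      public void run() {
        job.initTasks(); // job stays non-RUNNING until its setup task finishes
      }
    });
  }

  interface Job { void initTasks(); }
}
{code}

Any scheduler that initializes jobs this way has to handle the window between submission
and the setup task completing, which is exactly the situation discussed above.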

> Job Priorities are not handled properly 
> ----------------------------------------
>
>                 Key: HADOOP-4428
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4428
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Cluster:  106 TTs  MapCapacity=212, ReduceCapacity=212
> Single Queue=default, User Limit=25, Priorities = Yes.
> Using hadoop branch 0.19 revision=705159 
>            Reporter: Karam Singh
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4428-20081017.1.txt, HADOOP-4428-20081020.txt, HADOOP-4428.patch
>
>
> Job Priorities are not handled properly 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

