hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1204) Fair Scheduler preemption may preempt tasks running in slots unusable by the preempting job
Date Tue, 10 Nov 2009 22:41:28 GMT
Fair Scheduler preemption may preempt tasks running in slots unusable by the preempting job
-------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-1204
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1204
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: contrib/fair-share
    Affects Versions: 0.21.0
            Reporter: Todd Lipcon


The current preemption code works by first calculating how many tasks need to be preempted
to satisfy the min share constraints, and then killing an equal number of tasks from other
jobs, sorted to favor killing of young tasks. This works fine for the general case, but there
are some edge cases where this can cause problems.

For example, if the preempting job has blacklisted ("marked flaky") a particular task tracker,
and that tracker is running the youngest task, preemption can still kill that task. The preempting
job will then refuse that slot, since the tracker has been blacklisted. The same task that
just got killed then gets rescheduled in that slot. This repeats ad infinitum until a new
slot opens in the cluster.

I don't have a good test case for this, yet, but logically it is possible.

One potential fix would be to add an API to JobInProgress that functions identically to obtainNewMapTask
but does not schedule the task. The preemption code could then use this while iterating through
the sorted preemption list to check that the preempting jobs can actually make use of the
candidate slots before killing them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message