hadoop-common-dev mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4665) Add preemption to the fair scheduler
Date Wed, 01 Apr 2009 13:37:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694526#action_12694526 ]

Vinod K V commented on HADOOP-4665:

Some more miscellaneous points:
 - With the introduction of new timeouts, I think the importance of a template file for the
allocations increases. I remember you saying something about it on some jira. Have you filed

 - This patch adds FairSchedulerEventLog for logging various scheduler events in a machine-readable
format, but there is no place from which utilities can determine the format of the log records.
How should we track the record format: add a schema file, or change the logs
to be a list of key-value pairs, similar to JobHistory, instead of bare values?
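
To make the key-value option concrete, here is a minimal sketch of a self-describing log line. The class name, the method names, and the exact EVENT KEY="value" layout are assumptions for illustration; they are not taken from the patch or from the JobHistory code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: writes and reads event log
// lines of the form  EVENT_TYPE KEY="value" KEY="value" ...
public class KeyValueEventLog {

    /** Formats one event; fields are written in the map's iteration order. */
    static String format(String eventType, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder(eventType);
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.toString();
    }

    /** Escapes backslashes and double quotes so that values stay parseable. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Parses a line produced by format() back into its fields. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        int i = line.indexOf(' ');                 // skip the event type
        while (i >= 0 && i < line.length()) {
            int eq = line.indexOf("=\"", i);
            if (eq < 0) break;
            String key = line.substring(i, eq).trim();
            StringBuilder value = new StringBuilder();
            int j = eq + 2;                        // first character of the value
            while (j < line.length() && line.charAt(j) != '"') {
                if (line.charAt(j) == '\\' && j + 1 < line.length()) {
                    j++;                           // drop the escape, keep the next char
                }
                value.append(line.charAt(j));
                j++;
            }
            fields.put(key, value.toString());
            i = j + 1;                             // continue after the closing quote
        }
        return fields;
    }
}
```

With named fields, a utility can look values up by key, so new fields can be added without breaking readers that depend on column position.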

 - There is a lot of common code between {code}int org.apache.hadoop.mapred.FairScheduler.preemptTasks(JobInProgress job, TaskType type, int maxToPreempt){code}
and {code}int org.apache.hadoop.mapred.CapacityTaskScheduler.MapSchedulingMgr.killTasksFromJob(JobInProgress job, int tasksToKill){code}.
In fact, most of it is the same. I think we should try to refactor this common code,
though I don't know whether we want to do it in this jira itself.

> Add preemption to the fair scheduler
> ------------------------------------
>                 Key: HADOOP-4665
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4665
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/fair-share
>            Reporter: Matei Zaharia
>            Assignee: Matei Zaharia
>             Fix For: 0.21.0
>         Attachments: fs-preemption-v0.patch, hadoop-4665-v1.patch, hadoop-4665-v1b.patch,
hadoop-4665-v2.patch, hadoop-4665-v3.patch, hadoop-4665-v4.patch
> Task preemption is necessary in a multi-user Hadoop cluster for two reasons: users might
submit long-running tasks by mistake (e.g. an infinite loop in a map program), or tasks may
be long due to having to process large amounts of data. The Fair Scheduler (HADOOP-3746) has
a concept of guaranteed capacity for certain queues, as well as a goal of providing good performance
for interactive jobs on average through fair sharing. Therefore, it will support preempting
under two conditions:
> 1) A job isn't getting its _guaranteed_ share of the cluster for at least T1 seconds.
> 2) A job is getting significantly less than its _fair_ share for T2 seconds (e.g. less
than half its share).
> T1 will be chosen smaller than T2 (and will be configurable per queue) to meet guarantees
quickly. T2 is meant as a last resort in case non-critical jobs in queues with no guaranteed
capacity are being starved.
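
The two conditions above can be sketched as a single check. This is only an illustration under stated assumptions: JobState, its fields, and shouldPreempt are hypothetical names rather than Hadoop classes, and the 0.5 factor is the "less than half its share" example from the description.

```java
// Hypothetical sketch of the two preemption conditions; not Hadoop code.
public class PreemptionCheck {

    /** Assumed snapshot of one job's scheduling state. */
    static final class JobState {
        final int runningTasks;
        final int guaranteedShare;          // guaranteed slots (0 if none)
        final double fairShare;             // fair share in slots
        final long lastAtGuaranteedShareMs; // last time the guarantee was met
        final long lastAtHalfFairShareMs;   // last time at or above half the fair share

        JobState(int runningTasks, int guaranteedShare, double fairShare,
                 long lastAtGuaranteedShareMs, long lastAtHalfFairShareMs) {
            this.runningTasks = runningTasks;
            this.guaranteedShare = guaranteedShare;
            this.fairShare = fairShare;
            this.lastAtGuaranteedShareMs = lastAtGuaranteedShareMs;
            this.lastAtHalfFairShareMs = lastAtHalfFairShareMs;
        }
    }

    /** Returns true if the job qualifies for preemption under condition 1 or 2. */
    static boolean shouldPreempt(JobState job, long nowMs, long t1Ms, long t2Ms) {
        // Condition 1: below the guaranteed share for at least T1.
        boolean belowGuarantee = job.runningTasks < job.guaranteedShare
                && nowMs - job.lastAtGuaranteedShareMs >= t1Ms;
        // Condition 2: below half the fair share for at least T2.
        boolean starved = job.runningTasks < job.fairShare * 0.5
                && nowMs - job.lastAtHalfFairShareMs >= t2Ms;
        return belowGuarantee || starved;
    }
}
```

Because T1 is smaller than T2, a job with a guarantee triggers preemption sooner than a job that relies only on its fair share.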
> When deciding which tasks to kill to make room for the job, we will use the following
> - Look for tasks to kill only in jobs that have more than their fair share, ordering
these by deficit (most overscheduled jobs first).
> - For maps: kill tasks that have run for the least amount of time (limiting wasted time).
> - For reduces: similar to maps, but give extra preference for reduces in the copy phase
where there is not much map output per task (at Facebook, we have observed this to be the
main time we need preemption - when a job has a long map phase and its reducers are mostly
sitting idle and filling up slots).
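
The selection rules for map tasks can be sketched as follows. Several things here are assumptions and not part of the description: the Job and Task classes are hypothetical, "deficit" is approximated as running tasks minus fair share, and the sketch never takes a job below its fair share. The extra preference for reduces in the copy phase is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of victim selection for map tasks; not Hadoop code.
public class PreemptionVictims {

    static final class Task {
        final String id;
        final long startTimeMs;
        Task(String id, long startTimeMs) { this.id = id; this.startTimeMs = startTimeMs; }
    }

    static final class Job {
        final String name;
        final double fairShare;
        final List<Task> running;
        Job(String name, double fairShare, List<Task> running) {
            this.name = name; this.fairShare = fairShare; this.running = running;
        }
        /** Assumed proxy for how overscheduled the job is. */
        double excess() { return running.size() - fairShare; }
    }

    /** Picks up to maxToPreempt tasks, most overscheduled jobs first. */
    static List<Task> pickMapTasksToKill(List<Job> jobs, int maxToPreempt) {
        // Only jobs running more than their fair share are candidates.
        List<Job> over = new ArrayList<>();
        for (Job j : jobs) {
            if (j.excess() > 0) over.add(j);
        }
        over.sort(Comparator.comparingDouble(Job::excess).reversed());

        List<Task> victims = new ArrayList<>();
        for (Job j : over) {
            // Latest start time first, i.e. the tasks that have run the least.
            List<Task> youngestFirst = new ArrayList<>(j.running);
            youngestFirst.sort(
                Comparator.comparingLong((Task t) -> t.startTimeMs).reversed());
            int canTake = (int) Math.floor(j.excess()); // stay at or above fair share
            for (Task t : youngestFirst) {
                if (victims.size() >= maxToPreempt || canTake-- <= 0) break;
                victims.add(t);
            }
        }
        return victims;
    }
}
```

Killing the most recently started tasks first limits the amount of completed work that is thrown away.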

