hadoop-mapreduce-issues mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2201) Quicker preemption causes excessive preemption in FairScheduler
Date Thu, 25 Nov 2010 20:01:37 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935860#action_12935860 ]

Matei Zaharia commented on MAPREDUCE-2201:

Why can't the scheduler kill that many tasks in 1 minute? Does it have something to do with
the heartbeat interval? (I believe that two heartbeats are required -- one to kill the task
and one to get back the fact that it was killed and request a new task.)

> Quicker preemption causes excessive preemption in FairScheduler
> ---------------------------------------------------------------
>                 Key: MAPREDUCE-2201
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2201
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>            Reporter: Joydeep Sen Sarma
> One problem we are seeing is that FairScheduler repeatedly preempts for the same job.
> This is presumably because our preemption interval is set to a low number (1 minute). FS queues
> up N tasks to be killed, but in 1 minute it is not able to kill them and schedule new tasks on all
> of these slots. As a result, after 1 minute it again preempts a whole bunch of tasks.
> We could (and probably will) work around this by increasing the preemption interval. However,
> this gives us a hard tradeoff between accurate preemption and timely preemption. Not good.
> Ideally we want to make the first set of preemptions quickly (to provide responsive behavior
> to new jobs, for example) but wait thereafter (to make sure that the kill actions have actually
> been processed).
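
The repeated-preemption effect described in the issue can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not FairScheduler code: the function `simulate`, its parameters, and the idea of subtracting in-flight kills from the deficit are all hypothetical, and `kill_latency` stands in for whatever delay (for example, heartbeats) separates issuing a kill from the slot being reassigned.

```python
from collections import deque


def simulate(deficit, rounds, kill_latency, track_pending):
    """Return the total number of tasks killed over `rounds` preemption checks.

    deficit:       slots the starved job is owed at the start (assumed fixed demand)
    kill_latency:  preemption checks that pass before a kill frees a slot
                   and that slot is handed to the starved job (assumption)
    track_pending: if True, kills still in flight are subtracted from the
                   deficit before deciding how many more tasks to kill
    """
    in_flight = deque()  # entries are (check_when_slot_arrives, task_count)
    total_killed = 0
    for now in range(rounds):
        # Kills whose latency has elapsed have delivered slots to the job.
        # The deficit may go negative, meaning the job received more than owed.
        while in_flight and in_flight[0][0] <= now:
            deficit -= in_flight.popleft()[1]
        pending = sum(count for _, count in in_flight)
        want = deficit - pending if track_pending else deficit
        want = max(0, want)
        if want:
            in_flight.append((now + kill_latency, want))
            total_killed += want
    return total_killed


# The job is owed 10 slots; a kill takes 2 preemption checks to take effect.
naive = simulate(deficit=10, rounds=5, kill_latency=2, track_pending=False)
tracked = simulate(deficit=10, rounds=5, kill_latency=2, track_pending=True)
print(naive, tracked)
```

In this toy model the naive variant kills the same 10 tasks' worth of slots twice, because the second check fires before the first batch of kills has landed, while the variant that accounts for pending kills preempts once and then waits.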

