hadoop-common-dev mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3840) Support pluggable speculative execution
Date Sun, 27 Jul 2008 19:26:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617327#action_12617327 ]

Matei Zaharia commented on HADOOP-3840:

bq. Matei - it would be really nice to attach a junit test since this is a new feature - maybe
along the lines of HADOOP-2214?

That's a good idea. I actually have some modifications to SleepJob where you can make some
number of tasks "hang". Shall I submit that as part of this patch, or as a patch for 2214?

> Support pluggable speculative execution
> ---------------------------------------
>                 Key: HADOOP-3840
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3840
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Matei Zaharia
>            Assignee: Matei Zaharia
>            Priority: Minor
>         Attachments: HADOOP-3840-v1.txt, HADOOP-3840-v2.patch
> HADOOP-3412 introduced a way to plug in a job scheduler for MapReduce. However, the
job schedulers all use JobInProgress.obtainNewMapTask or obtainNewReduceTask to select tasks
to run from each job, and these methods use a threshold-based speculative execution algorithm
that has several shortcomings (see, for example, the JIRAs about the scheduler not speculating
tasks that freeze after reaching 80% progress). As a first step towards supporting better
speculative execution policies while not breaking backwards compatibility, it makes sense to
make the speculative execution policy pluggable. Luckily this is easy - we just need an interface
around obtainNewMapTask and obtainNewReduceTask. This JIRA suggests adding a TaskSelector abstract
class which, given a TaskTracker and a JobInProgress, chooses a task to run on the tracker.
A default implementation that uses the current methods in JobInProgress is provided. Both
TaskSchedulers in trunk are changed to use TaskSelector.
> In addition, there are methods to count how many speculative tasks a job needs, since
TaskInProgress.hasSpeculative() may not work if we change the algorithm for selecting speculative
tasks. This count is needed for some schedulers, such as a fair scheduler.
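
For illustration only, here is a rough Java sketch of the shape such a TaskSelector could take. It is
not the code from the attached patches. Task, TaskTrackerStatus and JobInProgress below are
hypothetical, heavily simplified stand-ins for the Hadoop classes, the method signatures are
assumptions, and the names DefaultTaskSelector, neededSpeculativeMaps and neededSpeculativeReduces
are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the Hadoop classes, reduced to what the sketch needs.
class Task {
    final String id;
    Task(String id) { this.id = id; }
}

class TaskTrackerStatus {
    final String trackerName;
    TaskTrackerStatus(String trackerName) { this.trackerName = trackerName; }
}

class JobInProgress {
    private final List<Task> pendingMaps = new ArrayList<>();
    private final List<Task> pendingReduces = new ArrayList<>();

    void addMap(String id) { pendingMaps.add(new Task(id)); }
    void addReduce(String id) { pendingReduces.add(new Task(id)); }

    // Stand-ins for the existing selection methods. The real ones take more
    // parameters and apply the threshold-based speculation logic.
    Task obtainNewMapTask(TaskTrackerStatus tracker) {
        return pendingMaps.isEmpty() ? null : pendingMaps.remove(0);
    }

    Task obtainNewReduceTask(TaskTrackerStatus tracker) {
        return pendingReduces.isEmpty() ? null : pendingReduces.remove(0);
    }
}

// The pluggable policy: given a tracker and a job, choose a task to run there.
abstract class TaskSelector {
    abstract Task obtainNewMapTask(TaskTrackerStatus tracker, JobInProgress job);
    abstract Task obtainNewReduceTask(TaskTrackerStatus tracker, JobInProgress job);

    // Counts a scheduler (for example a fair scheduler) can query instead of
    // relying on TaskInProgress.hasSpeculative().
    abstract int neededSpeculativeMaps(JobInProgress job);
    abstract int neededSpeculativeReduces(JobInProgress job);
}

// Default policy: delegate to the job's existing methods, so behavior is unchanged.
class DefaultTaskSelector extends TaskSelector {
    @Override
    Task obtainNewMapTask(TaskTrackerStatus tracker, JobInProgress job) {
        return job.obtainNewMapTask(tracker);
    }

    @Override
    Task obtainNewReduceTask(TaskTrackerStatus tracker, JobInProgress job) {
        return job.obtainNewReduceTask(tracker);
    }

    // The stub job tracks no task progress, so this sketch reports no
    // speculative demand. A real implementation would inspect running tasks.
    @Override
    int neededSpeculativeMaps(JobInProgress job) { return 0; }

    @Override
    int neededSpeculativeReduces(JobInProgress job) { return 0; }
}

class TaskSelectorDemo {
    public static void main(String[] args) {
        JobInProgress job = new JobInProgress();
        job.addMap("map_0");
        TaskSelector selector = new DefaultTaskSelector();
        Task t = selector.obtainNewMapTask(new TaskTrackerStatus("tracker_1"), job);
        System.out.println(t == null ? "no task" : t.id);
    }
}
```

Under these assumptions a scheduler would hold a TaskSelector reference and call it instead of
calling JobInProgress directly, so an alternative speculation policy could be swapped in as a
different subclass without changing the scheduler.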
