hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2141) speculative execution start up condition based on completion time
Date Thu, 15 Nov 2007 19:06:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542850 ]

Arun C Murthy commented on HADOOP-2141:
---------------------------------------

Here are some thoughts about how to go about it:

I propose we track the average completion time of maps and reduces (separately, of course) and
spawn a speculative task when a task has been running 1.5x or 2x longer than that average
(should we be more or less conservative?). However, to ensure that the system isn't inundated
with speculative tasks, I propose that this comes into effect only when more than 90% or 95%
of the tasks (of that kind) are complete.

So, if we have 2000 reduces and the average completion time of reduces is 60 minutes, then
(taking the 90% and 1.5x settings) we should launch a speculative reduce iff 1800 reduces are
done and a reducer has run for more than 90 minutes.
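
To make the proposed check concrete, here is a minimal sketch in Python. It is not Hadoop code: the names `SpeculationPolicy` and `should_speculate` are hypothetical, the defaults are the 1.5x and 90% figures floated above, and treating "exactly 90% done" as sufficient is an assumption taken from the 1800-of-2000 example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeculationPolicy:
    """Hypothetical knobs for the proposal; not an existing Hadoop setting."""
    slowdown_factor: float = 1.5      # the comment suggests 1.5x or 2x
    min_completed_percent: int = 90   # the comment suggests 90% or 95%


def should_speculate(running_minutes, completed_durations, total_tasks,
                     policy=SpeculationPolicy()):
    """Return True if a running task qualifies for a speculative instance.

    running_minutes: how long the candidate task has been running.
    completed_durations: run times (minutes) of finished tasks of the same kind.
    total_tasks: total number of tasks of that kind (maps or reduces) in the job.
    """
    if total_tasks <= 0 or not completed_durations:
        return False  # no average to compare against yet

    # Gate: only speculate once enough tasks of this kind have finished.
    # Integer arithmetic avoids float rounding at the exact threshold.
    done = len(completed_durations)
    if done * 100 < total_tasks * policy.min_completed_percent:
        return False

    # Trigger: the task has run longer than slowdown_factor x the average.
    average = sum(completed_durations) / done
    return running_minutes > policy.slowdown_factor * average
```

With 1800 of 2000 reduces finished at 60 minutes each, `should_speculate(91, [60] * 1800, 2000)` is true, while a reducer at exactly 90 minutes, or the same reducer with only 1799 reduces finished, does not qualify. Separate policies for maps and reduces would just be two `SpeculationPolicy` instances.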

Should we disable this for maps?
Should we have separate policies for maps and reduces (percentage and the running-time lag
vis-a-vis completed tasks)?

Thoughts? 

> speculative execution start up condition based on completion time
> -----------------------------------------------------------------
>
>                 Key: HADOOP-2141
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2141
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.15.0
>            Reporter: Koji Noguchi
>            Assignee: Arun C Murthy
>             Fix For: 0.16.0
>
>
> We had one job with speculative execution hang.
> 4 reduce tasks were stuck with 95% completion because of a bad disk. 
> Devaraj pointed out 
> bq. One of the conditions that must be met for launching a speculative instance of a
> task is that it must be at least 20% behind the average progress, and this is not true here.
> It would be nice if speculative execution also starts up when tasks stop making progress.
> Devaraj suggested 
> bq. Maybe, we should introduce a condition for average completion time for tasks in the
> speculative execution check.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

