hadoop-mapreduce-issues mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2062) speculative execution is too aggressive under certain conditions
Date Tue, 14 Sep 2010 00:11:35 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909059#action_12909059 ]

Joydeep Sen Sarma commented on MAPREDUCE-2062:

Another thing we have noticed is that a task's progress rate (especially a reducer's) is
usually quite low compared to the mean when the task first starts, which causes many false
speculations. However, the absolute progress rate of the speculated tasks is not bad at all:
most of them had a progress rate that would have taken them to 100% within 3-4 minutes.

One heuristic that seemed obvious after looking at this is that we should have an upper bound
on the progress rate: above that rate, speculation does not make sense (regardless of
mean/stddev). The proposal is to make this configurable as a 'minimum_duration' setting for
mappers and reducers. If a mapper/reducer is projected to finish within this duration, it
will not be speculated. Setting the duration to a small number like 3-4 minutes would weed
out a lot of excessive speculation.
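
A minimal sketch of the proposed check, in Python for brevity. This is an illustration only, not the JobTracker code: the function names, the use of the average progress rate since task start, and the value of 180 seconds are all assumptions made for the example.

```python
def projected_seconds_remaining(progress, elapsed_seconds):
    """Estimate the time to reach 100% at the task's average rate so far.

    progress is a fraction in [0.0, 1.0]; elapsed_seconds is the time since
    the task started. Both names are hypothetical.
    """
    if elapsed_seconds <= 0 or progress <= 0.0:
        return float("inf")  # no rate information yet
    rate = progress / elapsed_seconds  # fraction of work done per second
    return (1.0 - progress) / rate


def passes_minimum_duration(progress, elapsed_seconds, minimum_duration_seconds):
    """Return True if the task is still a candidate for speculation.

    A task projected to finish within minimum_duration_seconds is never
    speculated, regardless of how its rate compares to the mean/stddev.
    """
    remaining = projected_seconds_remaining(progress, elapsed_seconds)
    return remaining > minimum_duration_seconds


MINIMUM_DURATION = 180  # assumed value: 3 minutes

# 20% done after 30 s: projected to finish in about 120 s, so skip speculation.
fast_starter = passes_minimum_duration(0.2, 30, MINIMUM_DURATION)

# 5% done after 60 s: projected to need about 1140 s, so the usual
# mean/stddev comparison would still apply to this task.
slow_task = passes_minimum_duration(0.05, 60, MINIMUM_DURATION)
```

The check is meant as a filter applied before the mean/stddev comparison, so it can only reduce the number of speculated tasks.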

> speculative execution is too aggressive under certain conditions
> ----------------------------------------------------------------
>                 Key: MAPREDUCE-2062
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2062
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>         Environment: hadoop-20 with HADOOP-2141
>            Reporter: Joydeep Sen Sarma
> The function canBeSpeculated has subtle bugs that cause too much speculation in certain
> - It compares the current progress of the task with the last observed mean of all the
> tasks. If only one task is in question, then the progress rate decays as time progresses
> (in the absence of updates) and the std-dev is zero. So a job with a single reducer or mapper
> is almost always speculated.
> - If only a single task has reported progress, then the stddev is zero, so other tasks
> may be speculated aggressively.
> - Several tasks take a while to report progress initially. They seem to get speculated
> as soon as speculative-lag is over. The lag should be configurable at a minimum.
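
The single-task case in the quoted description can be illustrated with a small sketch. It assumes the slowness test has roughly the form "current rate < mean - k * stddev"; that form, the function names, and the numbers are assumptions for illustration, not the actual canBeSpeculated code.

```python
def progress_rate(progress, start_time, now):
    """Average progress per second, recomputed against the current time."""
    return progress / (now - start_time)


def is_slow(current_rate, mean_rate, stddev_rate, k=1.0):
    """Assumed form of the check: slow if more than k stddevs below the mean."""
    return current_rate < mean_rate - k * stddev_rate


# A job with one task: it reports 50% progress at t=100 (started at t=0).
# The mean over all tasks is recorded at that moment; with one sample the
# standard deviation is zero.
last_observed_mean = progress_rate(0.5, 0, 100)
stddev = 0.0

# At the moment of the update the task is not considered slow.
slow_at_update = is_slow(progress_rate(0.5, 0, 100), last_observed_mean, stddev)

# 30 s later, with no new progress report, the recomputed rate has decayed
# below the stale mean, so the task looks slow even though nothing changed.
slow_later = is_slow(progress_rate(0.5, 0, 130), last_observed_mean, stddev)
```

With a zero standard deviation, any decay at all puts the task below the threshold, which matches the observation that single-task jobs are almost always speculated.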

