hadoop-mapreduce-issues mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2162) speculative execution does not handle cases where stddev > mean well
Date Thu, 28 Oct 2010 17:18:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925876#action_12925876 ]

Devaraj Das commented on MAPREDUCE-2162:
----------------------------------------

The progress rate also depends on the amount of data each task consumes. A task with a lot
of data to process might appear slower than tasks that have less data to process (this is
especially true for reduces), and the current speculative execution logic might end up
speculating it. Data size is nowhere in the picture in the current logic for choosing
speculative tasks. I wanted to change this to take the processing rate into account via
MAPREDUCE-718, but never got to it. Processing rate might be a better basis than progress
rate for all the calculations we do.
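
To illustrate the difference between the two rates, here is a minimal sketch with made-up numbers. It assumes that progress is proportional to the bytes a task has processed, that progress rate is progress per second, and that processing rate is bytes per second. The class and field names are hypothetical and are not taken from Hadoop or from MAPREDUCE-718.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    """Hypothetical snapshot of one running task (not a Hadoop class)."""
    name: str
    input_bytes: int       # total data assigned to the task
    bytes_processed: int   # data consumed so far
    elapsed_secs: float    # wall-clock time since the task started

    def progress_rate(self) -> float:
        # Assumption: progress is the fraction of input consumed so far.
        return (self.bytes_processed / self.input_bytes) / self.elapsed_secs

    def processing_rate(self) -> float:
        # Bytes consumed per second, independent of total input size.
        return self.bytes_processed / self.elapsed_secs


# Made-up numbers: the large task has 10x the input and reads twice as fast.
small = TaskSample("small", 1_000_000_000, 500_000_000, 100.0)
large = TaskSample("large", 10_000_000_000, 1_000_000_000, 100.0)

for task in (small, large):
    print(task.name, task.progress_rate(), task.processing_rate())
```

With these numbers the large task has the lower progress rate even though it is consuming data faster, so a comparison based on progress rate alone would single it out.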

> speculative execution does not handle cases where stddev > mean well
> --------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2162
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2162
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Joydeep Sen Sarma
>
> The new speculation code only speculates tasks whose progress rate deviates from the
> mean progress rate of a job by more than some multiple (typically 1.0) of stddev. stddev can
> be larger than the mean, which means that if this condition ever holds, even a task with
> a progress rate of 0 will not be speculated.
> It's not clear that this condition is self-correcting. If a job has thousands of tasks,
> then one laggard task, in spite of not being speculated for a long time, may not be able
> to fix the condition of stddev > mean.
> We have seen jobs where tasks have not been speculated for hours, and this seems to be one
> explanation for why that may have happened. Here's an example job with stddev > mean:
> DataStatistics: count is 6, sum is 1.7141054797775723E-8, sumSquares is 2.9381575958035014E-16
> mean is 2.8568424662959537E-9 std() is 6.388093955645905E-9
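
The DataStatistics figures quoted above can be checked with a short sketch. It assumes that std() is the population standard deviation derived from count, sum, and sumSquares, and that a task is speculated only when its progress rate is strictly below mean minus a multiple of stddev. The function names are hypothetical, and the exact comparison used by the speculation code is an assumption.

```python
import math


def data_statistics(count, total, sum_squares):
    """Return (mean, population std) from running sums.

    Assumption: std is sqrt(E[x^2] - mean^2), which reproduces the
    std() value quoted in the issue description.
    """
    mean = total / count
    variance = max(sum_squares / count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def is_speculation_candidate(rate, mean, std, multiple=1.0):
    # Assumed rule: speculate only tasks slower than mean - multiple * std.
    return rate < mean - multiple * std


# Values copied from the DataStatistics line in the issue description.
mean, std = data_statistics(6, 1.7141054797775723e-8, 2.9381575958035014e-16)
threshold = mean - 1.0 * std
stalled = is_speculation_candidate(0.0, mean, std)

print("mean:", mean)
print("std:", std)
print("threshold:", threshold)
print("task with rate 0 speculated:", stalled)
```

Under these assumptions the threshold is negative because stddev exceeds the mean, and since a progress rate cannot be below zero, no task can fall under it.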

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

