hadoop-mapreduce-issues mailing list archives

From "Adam Kramer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1060) JT should kill running maps when all the reducers have completed
Date Fri, 13 Aug 2010 23:19:19 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898468#action_12898468 ]

Adam Kramer commented on MAPREDUCE-1060:

I argue that this is a bug, not an improvement. If the mapper completes successfully on the
first try but then fails on the unnecessary 2nd..5th try, the whole job will fail unnecessarily.

Also, this is still occurring, and it has been happening a lot lately. It is especially frequent
for jobs whose mappers take a long time, because the map node may lose the task tracker for
the jobs that finish quickly before the later-ending jobs have finished.

Is there any case in which the side-effects of a second-run mapper would be such that the
whole job SHOULD fail even though all the reducers have finished?

> JT should kill running maps when all the reducers have completed
> ----------------------------------------------------------------
>                 Key: MAPREDUCE-1060
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1060
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Jothi Padmanabhan
> We have seen some situations where maps are still running when all the reducers have
> completed. This could happen because of lost TTs, the interplay of speculative tasks with bad
> TTs, etc. If the maps take a long time to run, they unnecessarily delay job completion,
> as this map output is not required anyway. The JT should possibly kill running maps
> when all the reducers have completed.
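
The difference between the two behaviours discussed here can be sketched with a toy model. This is not Hadoop code and does not use any JobTracker API: the class ToyJob, the flag kill_maps_when_reducers_done, the task id, and the retry limit of 4 are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

MAX_MAP_ATTEMPTS = 4  # assumed per-task retry limit for this toy model


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job's bookkeeping; not a Hadoop class."""
    total_reducers: int
    kill_maps_when_reducers_done: bool
    finished_reducers: int = 0
    running_maps: set = field(default_factory=set)      # maps being re-run
    failed_attempts: dict = field(default_factory=dict)  # map id -> failures
    state: str = "RUNNING"

    def _maybe_complete(self):
        # The job succeeds once every reducer is done and no map is running.
        if (self.finished_reducers == self.total_reducers
                and not self.running_maps):
            self.state = "SUCCEEDED"

    def reducer_finished(self):
        self.finished_reducers += 1
        if (self.finished_reducers == self.total_reducers
                and self.kill_maps_when_reducers_done):
            # Proposed behaviour: the map output is no longer needed,
            # so the leftover map attempts are killed, not retried.
            self.running_maps.clear()
        self._maybe_complete()

    def map_attempt_failed(self, map_id):
        if self.state != "RUNNING" or map_id not in self.running_maps:
            return  # killed or unknown attempts cannot fail the job
        count = self.failed_attempts.get(map_id, 0) + 1
        self.failed_attempts[map_id] = count
        if count >= MAX_MAP_ATTEMPTS:
            self.state = "FAILED"


def run(kill_maps_when_reducers_done):
    job = ToyJob(total_reducers=1,
                 kill_maps_when_reducers_done=kill_maps_when_reducers_done)
    job.running_maps.add("m_000001")  # a map re-run, e.g. after a lost TT
    job.reducer_finished()            # the only reducer completes
    for _ in range(MAX_MAP_ATTEMPTS):
        job.map_attempt_failed("m_000001")
    return job.state
```

In this model, run(False) ends in the FAILED state even though every reducer finished, which is the case argued above to be a bug, while run(True) ends in SUCCEEDED because the unneeded map is killed first.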

