hadoop-mapreduce-issues mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1924) Mappers running when reducers have finished
Date Wed, 07 Jul 2010 22:30:54 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886127#action_12886127 ]

Joydeep Sen Sarma commented on MAPREDUCE-1924:
----------------------------------------------

This cannot be done by default, I think, because mappers can have side effects. But an option
seems desirable.

> Mappers running when reducers have finished
> -------------------------------------------
>
>                 Key: MAPREDUCE-1924
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1924
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Adam Kramer
>
> Occasionally, I will run jobs for which some reducers are able to finish but there are
> still mappers running. I understand why sometimes mappers restart themselves even after the
> reduce phase has begun--too many fetch-failures, for example. But in today's case, ALL of
> the reducers have succeeded and are done, so these mappers really ARE unnecessary...so it
> is a bug that they are running.
> Then, I killed one of them to see what was up--it just restarted itself. So, it is another
> bug that mappers don't know they're unnecessary when they're killed.
> My guess is that if one of these jobs, which clearly finished at least once, were to
> die randomly a few times, it would take the whole job with it--even though the job has completed.
> Whenever all reduce tasks are complete, Hadoop should kill ALL remaining map tasks and
> immediately move to finish the job.
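
To make the proposal concrete, here is a minimal sketch of the decision the reporter asks for, gated behind the opt-in switch suggested in the comment above. It is a standalone toy model, not Hadoop code: the class MapKillPolicySketch, the TaskState enum, and the killMapsWhenReducesDone flag are all hypothetical names invented for illustration, and nothing here corresponds to an actual JobTracker API or configuration key.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of the proposed policy; not Hadoop's scheduler code. */
final class MapKillPolicySketch {

    /** Simplified task states, assumed for this sketch only. */
    enum TaskState { RUNNING, SUCCEEDED, FAILED, KILLED }

    /** True only if the job has reducers and every one of them has succeeded. */
    static boolean allReducesSucceeded(List<TaskState> reduces) {
        if (reduces.isEmpty()) {
            return false; // map-only job: the maps produce the final output
        }
        for (TaskState state : reduces) {
            if (state != TaskState.SUCCEEDED) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the indices of map tasks that could be killed.
     * The flag is off by default in spirit: mappers may have side effects,
     * so the caller has to opt in explicitly.
     */
    static List<Integer> mapsToKill(List<TaskState> maps,
                                    List<TaskState> reduces,
                                    boolean killMapsWhenReducesDone) {
        List<Integer> victims = new ArrayList<>();
        if (!killMapsWhenReducesDone || !allReducesSucceeded(reduces)) {
            return victims; // leave every map alone
        }
        for (int i = 0; i < maps.size(); i++) {
            if (maps.get(i) == TaskState.RUNNING) {
                victims.add(i);
            }
        }
        return victims;
    }
}
```

With the flag off, mapsToKill always returns an empty list, which matches the concern that this behavior cannot safely be the default. With the flag on, only maps still in the RUNNING state are selected, and only once no reducer is left unfinished.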

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

