hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
Date Tue, 23 Jul 2013 19:04:50 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717487#comment-13717487 ]

Jason Lowe commented on MAPREDUCE-5251:

Thanks for the update, Ashwin.  A couple of minor things:

* reportLocalError probably should just compute the hostname itself rather than requiring callers to do so
* there is whitespace missing between arguments added in the latest patch (which will be fixed if we remove the reduceHost arg to reportLocalError)
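
To illustrate the first point, here is a minimal sketch of a reporter that resolves the hostname internally, so callers pass only the exception. This is not the code from the patch: the class name LocalErrorReporter, the ErrorSink interface, the message format, and the "unknown-host" fallback are all assumptions made up for the example. Only the method name reportLocalError and the idea of dropping the reduceHost argument come from the comment above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical stand-in for the class that owns reportLocalError.
final class LocalErrorReporter {

    // Hypothetical destination for the report (not a Hadoop interface).
    interface ErrorSink {
        void fatal(String message);
    }

    private final ErrorSink sink;

    LocalErrorReporter(ErrorSink sink) {
        this.sink = sink;
    }

    // Resolve the local hostname here so that no caller has to pass it in.
    static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            // Assumed fallback: the report should still go out without a name.
            return "unknown-host";
        }
    }

    // No reduceHost parameter: the caller supplies only the cause.
    void reportLocalError(IOException cause) {
        sink.fatal("Shuffle failed with local error on "
                + localHostName() + ": " + cause.getMessage());
    }
}
```

With this shape a call site reduces to reportLocalError(ioe), which also removes the argument list where the whitespace was missing.
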
> Reducer should not implicate map attempt if it has insufficient space to fetch map output
> -----------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5251
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.7, 2.0.4-alpha
>            Reporter: Jason Lowe
>            Assignee: Ashwin Shankar
>         Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, MAPREDUCE-5251-4.txt
> A job can fail if a reducer happens to run on a node with insufficient space to hold
> a map attempt's output.  The reducer keeps reporting the map attempt as bad, and if the map
> attempt ends up being re-launched too many times before the reducer decides that it may be
> the real problem, the job can fail.
> In that scenario it would be better to re-launch the reduce attempt, in the hope that it
> will run on another node that has sufficient space to complete the shuffle.  Reporting the
> map attempt as bad and relaunching the map task doesn't change the fact that the reducer
> can't hold the output.
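
The behavior the issue asks for can be sketched as a small decision function. This is an illustrative assumption, not the logic of the attached patches: the class FetchFailureHandler, the Action enum, and the byte-count comparison are hypothetical, and the real shuffle code may detect a local failure differently.

```java
// Hypothetical sketch: decide who to blame when a map output fetch fails.
final class FetchFailureHandler {

    enum Action {
        REPORT_MAP_ATTEMPT,   // the map output itself looks bad or unreachable
        FAIL_REDUCE_ATTEMPT   // the problem is local to the reducer's node
    }

    // Assumed inputs: the size of the map output and the space the reducer
    // can still use locally, both in bytes.
    static Action onFetchFailure(long mapOutputBytes, long usableLocalBytes) {
        if (mapOutputBytes > usableLocalBytes) {
            // Re-running the map cannot help; let the reduce attempt be
            // re-launched, possibly on a node with more space.
            return Action.FAIL_REDUCE_ATTEMPT;
        }
        return Action.REPORT_MAP_ATTEMPT;
    }
}
```

The point of the split is that a local shortage never counts against the map attempt, so the map is not re-launched for a problem it did not cause.
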
