hadoop-mapreduce-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5168) Reducer can OOM during shuffle because on-disk output stream not released
Date Fri, 19 Apr 2013 06:03:15 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636109#comment-13636109 ]

Hadoop QA commented on MAPREDUCE-5168:
--------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12579487/MAPREDUCE-5168-branch-0.23.patch
  against trunk revision .

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3538//console

This message is automatically generated.
                
> Reducer can OOM during shuffle because on-disk output stream not released
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5168
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5168
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.7, 2.0.5-beta
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: MAPREDUCE-5168-branch-0.23.patch, MAPREDUCE-5168.patch
>
>
> If a reducer needs to shuffle a map output to disk, it opens an output stream and writes
> the data to disk.  However, it does not release the reference to the output stream held within
> the MapOutput, and the output stream can have a 128K buffer attached to it.  If enough of these
> on-disk outputs are queued up waiting to be merged, the retained buffers can cause the reducer
> to OOM during the shuffle phase.  In one case I saw, 1200 on-disk outputs were queued up to be
> merged, leading to an extra 150MB of pressure on the heap from output stream buffers that were
> no longer necessary.
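
The retention pattern described above, and the 1200 x 128K = 150MB figure, can be illustrated with a minimal sketch. This is not the attached patch or Hadoop's actual MapOutput code: the class OnDiskOutputSketch, its field and method names, and the use of BufferedOutputStream are all assumptions made for illustration. The only idea taken from the report is that the object should drop its reference to the buffered stream once the data has been written.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stand-in for an on-disk map output; not Hadoop's real class. */
public class OnDiskOutputSketch {
    /** 128K buffer per output stream, the size quoted in the report. */
    public static final int BUFFER_BYTES = 128 * 1024;

    // While this field is non-null, the 128K buffer stays reachable.
    private OutputStream disk;

    public OnDiskOutputSketch(OutputStream sink) {
        this.disk = new BufferedOutputStream(sink, BUFFER_BYTES);
    }

    /** Writes the data, closes the stream, then drops the reference. */
    public void shuffle(byte[] data) throws IOException {
        try (OutputStream out = disk) {
            out.write(data);
        } finally {
            // Releasing the reference lets the buffer be collected while
            // this object waits in the merge queue.
            disk = null;
        }
    }

    public boolean holdsStream() {
        return disk != null;
    }

    /** Heap retained if every queued output still held its buffer. */
    public static long retainedBytes(int queuedOutputs) {
        return (long) queuedOutputs * BUFFER_BYTES;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OnDiskOutputSketch output = new OnDiskOutputSketch(sink);
        output.shuffle(new byte[] {1, 2, 3});
        System.out.println("holds stream after shuffle: " + output.holdsStream());
        System.out.println("MB retained by 1200 outputs: "
                + retainedBytes(1200) / (1024 * 1024)); // prints 150
    }
}
```

Without the assignment to null in the finally block, each queued object in this sketch would keep its closed stream, and therefore its buffer, reachable until the merge consumed it.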

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
