hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-570) Map tasks may fail due to out of memory, if the number of reducers are moderately big
Date Mon, 02 Oct 2006 19:14:20 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-570?page=all ]

Doug Cutting resolved HADOOP-570.
---------------------------------

    Resolution: Duplicate

This is a duplicate of HADOOP-331.

> Map tasks may fail due to out of memory, if the number of reducers are moderately big
> -------------------------------------------------------------------------------------
>
>                 Key: HADOOP-570
>                 URL: http://issues.apache.org/jira/browse/HADOOP-570
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>
> Map tasks may fail due to out of memory, if the number of reducers are moderately big.

> In my case, I set child task heap size to 1GB, turned on compression for the mapoutput
files. 
> The average size of input records is about 30K (I don't know the variation though). 
> A lot of map tasks failed due to out of memory when the number of reducers was at 400
and higher.
> The number of reducers can be somewhat higher (as high as 800) if the compression for
the mapoutput files was off).
> This problem will impose a hard limit on the scalability of map/reduce clusters.
> One possible solution to this problem is to let the mapper to write out single map output
file, 
> and then to perform sort/partition as a separate phrase. 
> his will also make it unnecessary for  the reducers to perform sort on individual portions
from mappers. 
> Rather, the reducers should just perform merge operations on the map output files directly.

> This may even allow the possibility of dynamically collect some statistics of  the map
outputs and 
> use the stats to drive the partition on the mapper side, and obtain the optimal merge
plan on the reducer side!
>  

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message