hadoop-mapreduce-issues mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4655) MergeManager.reserve can OutOfMemoryError if more than 10% of max memory is used on non-MapOutputs
Date Thu, 27 Sep 2012 23:29:07 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465215#comment-13465215 ]

Sandy Ryza commented on MAPREDUCE-4655:
---------------------------------------

Looked at a heap dump, and it appears that the problem was caused by Avro holding on to a
reference after it was done with it.  Filed AVRO-1175.
                
> MergeManager.reserve can OutOfMemoryError if more than 10% of max memory is used on non-MapOutputs
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4655
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4655
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.1-alpha
>            Reporter: Sandy Ryza
>
> The MergeManager does a memory check, using a limit that defaults to 90% of Runtime.getRuntime().maxMemory(). Allocations that would bring the total memory allocated by the MergeManager over this limit are asked to wait until memory frees up. Disk is used for single allocations that would be over 25% of the memory limit.
> If some other part of the reducer were to be using more than 10% of the memory, the current check wouldn't stop an OutOfMemoryError.
> Before creating an in-memory MapOutput, a check can be done using Runtime.getRuntime().freeMemory(), waiting until memory is freed up if it fails (a sketch of such a check follows the stack trace below).
> 12/08/17 10:36:29 INFO mapreduce.Job: Task Id : attempt_1342723342632_0010_r_000005_0, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#6
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
> at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
> at org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
> at org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
> at org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:327)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:273)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:153)
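
A minimal sketch of the kind of check proposed in the description above, assuming a standalone helper rather than the real MergeManager code; the class and method names here (HeapHeadroomGate, awaitHeadroom, release) are hypothetical, for illustration only. The idea: before building an in-memory map output, block until the JVM reports enough heap headroom for it.

  // Editorial sketch only -- not the actual MergeManager code. The class and
  // method names are hypothetical. It illustrates the proposed check: before
  // building an in-memory map output, block until the JVM reports enough
  // free heap for it.
  public class HeapHeadroomGate {

    private final Object lock = new Object();

    // Headroom before an OutOfMemoryError is not just freeMemory():
    // the heap can still grow from totalMemory() up to maxMemory().
    private static long headroom() {
      Runtime rt = Runtime.getRuntime();
      return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }

    /** Blocks until at least requestedSize bytes of heap headroom are available. */
    public void awaitHeadroom(long requestedSize) throws InterruptedException {
      synchronized (lock) {
        while (headroom() < requestedSize) {
          // Woken by release() when an in-memory segment is merged to disk or
          // otherwise freed; the timeout guards against missed wake-ups.
          lock.wait(1000);
        }
      }
    }

    /** Call after previously reserved memory has been returned. */
    public void release() {
      synchronized (lock) {
        lock.notifyAll();
      }
    }
  }

One caveat on using Runtime.getRuntime().freeMemory() directly: it only reports free space within the currently committed heap, so the sketch computes maxMemory() - totalMemory() + freeMemory(), which is closer to the headroom that actually remains before an OutOfMemoryError.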

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
