hadoop-common-dev mailing list archives

From Sanjay Sharma <sanjay.sha...@impetus.co.in>
Subject ReducerTask OOM failure
Date Thu, 29 Oct 2009 17:07:29 GMT
After moving to the Cloudera 0.20.1 release and upgrading to 64GB machines, we started seeing
occasional OOMs with higher numbers of reducers, at the point where the reducers start copying map outputs.

java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1539)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)

It turned out the problem is caused by the use of Java int in the check in the ReduceTask
ShuffleRamManager reserve method:

    // Wait till the request can be fulfilled...
    while ((size + requestedSize) > maxSize) {

The check breaks when (size + requestedSize) exceeds Integer.MAX_VALUE: the int sum "wraps around"
to a negative value, so the condition evaluates to false and the request is not made to wait.
Subsequent requests then keep on reserving RAM until the JVM finally crashes.
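
To illustrate the wrap-around, here is a minimal standalone sketch. It is not the actual ReduceTask code: the class name, method name, and the three values are made up for illustration, and only the shape of the comparison is taken from the check quoted above.

```java
public class ReserveOverflowDemo {
    // Same shape as the quoted check, with every operand an int.
    // Returns true when the caller would have to wait for memory.
    static boolean mustWaitInt(int size, int requestedSize, int maxSize) {
        return (size + requestedSize) > maxSize;
    }

    public static void main(String[] args) {
        int maxSize = 2_000_000_000;      // hypothetical limit, close to Integer.MAX_VALUE
        int size = 1_900_000_000;         // hypothetical bytes already reserved
        int requestedSize = 500_000_000;  // hypothetical new request

        // The true sum is 2,400,000,000, which does not fit in an int.
        System.out.println(size + requestedSize);                        // prints -1894967296
        System.out.println(mustWaitInt(size, requestedSize, maxSize));   // prints false
    }
}
```

The request clearly does not fit under the limit, yet the check reports that there is no need to wait.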

I checked whether this was related to HADOOP-3446 or was being resolved by HADOOP-318.

It looks like the problem would not occur after HADOOP-318, as Arun uses "long" for size rather
than the current buggy "int".

Should a JIRA be raised to fix this for pre-0.21.0 releases?
My fix was simple: while (((long)size + (long)requestedSize) > maxSize) {
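
As a rough sketch of why the cast helps (again a standalone illustration with made-up class name, method name, and values, not the patched Hadoop source): widening both operands to long before adding means the sum is computed in 64 bits, so it cannot wrap for any pair of int inputs.

```java
public class ReserveCheckFixed {
    // Same comparison as the proposed fix: the addition is done in long,
    // and maxSize is widened to long automatically for the comparison.
    static boolean mustWait(int size, int requestedSize, int maxSize) {
        return ((long) size + (long) requestedSize) > maxSize;
    }

    public static void main(String[] args) {
        // Hypothetical values whose true sum (2,400,000,000) exceeds Integer.MAX_VALUE.
        System.out.println(mustWait(1_900_000_000, 500_000_000, 2_000_000_000)); // prints true
        // A small request that fits under the limit still passes straight through.
        System.out.println(mustWait(100, 200, 1000));                            // prints false
    }
}
```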

I would be willing to create a JIRA and patch.

