hadoop-common-user mailing list archives

From: Leon Mergen <l.p.mer...@solatis.com>
Subject: OutOfMemoryError with map jobs
Date: Sat, 06 Sep 2008 13:35:37 GMT

I'm currently developing a map/reduce program that emits a fair number of map outputs per input
record (around 50 - 100), and I'm getting OutOfMemoryErrors:

2008-09-06 15:28:08,993 ERROR org.apache.hadoop.mapred.pipes.BinaryProtocol: java.lang.OutOfMemoryError:
Java heap space
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$BlockingBuffer.reset(MapTask.java:564)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:440)
        at org.apache.hadoop.mapred.pipes.OutputHandler.output(OutputHandler.java:55)
        at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:117)

The error is reproducible and always occurs at the same map progress percentage; when I emit
fewer outputs per input record, the problem goes away.
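
For context, this is a Pipes job. Stripped down to a skeleton, my mapper has roughly the shape
below; the class names, keys, and the emit loop are simplified placeholders, and the real
per-record processing is more involved, but each input record ends up in around 50 - 100 calls
to emit():

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"

// Placeholder mapper: fans every input record out into many output pairs.
class MyMapper : public HadoopPipes::Mapper {
public:
  MyMapper(HadoopPipes::TaskContext& context) {}
  void map(HadoopPipes::MapContext& context) {
    const std::string record = context.getInputValue();
    // In the real job this loop runs roughly 50-100 times per record.
    for (int i = 0; i < 75; ++i) {
      context.emit(HadoopUtils::toString(i), record);
    }
  }
};

// Trivial pass-through reducer, only here to make the skeleton complete.
class MyReducer : public HadoopPipes::Reducer {
public:
  MyReducer(HadoopPipes::TaskContext& context) {}
  void reduce(HadoopPipes::ReduceContext& context) {
    while (context.nextValue()) {
      context.emit(context.getInputKey(), context.getInputValue());
    }
  }
};

int main(int argc, char* argv[]) {
  return HadoopPipes::runTask(
      HadoopPipes::TemplateFactory<MyMapper, MyReducer>());
}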

Now, I have tried editing conf/hadoop-env.sh to increase HADOOP_HEAPSIZE to 2000 MB and to set
`export HADOOP_TASKTRACKER_OPTS="-Xms32m -Xmx2048m"`, but the problem persists at exactly the
same place.
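
For reference, the relevant lines in conf/hadoop-env.sh now read:

# Heap size for the Hadoop daemons, in MB.
export HADOOP_HEAPSIZE=2000
# Extra JVM options passed to the TaskTracker daemon.
export HADOOP_TASKTRACKER_OPTS="-Xms32m -Xmx2048m"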

My use case doesn't seem that unusual; is this a common problem, and if so, what are the usual
ways to work around it?

Thanks in advance for a response!


Leon Mergen
