hadoop-mapreduce-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Arun C Murthy <...@hortonworks.com>
Subject Re: OOM Error Map output copy.
Date Fri, 09 Dec 2011 18:29:15 GMT
Moving to mapreduce-user@, bcc common-user@. Please use project specific lists.


If you average as 0.5G output per-map, it's 5000 maps *0.5G -> 2.5TB over 12 reduces i.e.
nearly 250G per reduce - compressed! 

If you think you have 4:1 compression you are doing nearly a Terabyte per reducer... which
is way too high!

I'd recommend you bump to somewhere along 1000 reduces to get to 2.5G (compressed) per reducer
for your job. If your compression ratio is 2:1, try 500 reduces and so on.

If you are worried about other users, use the CapacityScheduler and submit your job to a queue
with a small capacity and max-capacity to restrict your job to 10 or 20 concurrent reduces
at a given point.


On Dec 7, 2011, at 10:51 AM, Niranjan Balasubramanian wrote:

> All 
> I am encountering the following out-of-memory error during the reduce phase of a large
> Map output copy failure : java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
> I tried increasing the memory available using mapped.child.java.opts but that only helps
a little. The reduce task eventually fails again. Here are some relevant job configuration
> 1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers filter out
a small percentage of the input ( less than 1%).
> 2. I am currently using 12 reducers and I can't increase this count by much to ensure
availability of reduce slots for other users. 
> 3. mapred.child.java.opts --> -Xms512M -Xmx1536M -XX:+UseSerialGC
> 4. mapred.job.shuffle.input.buffer.percent	--> 0.70
> 5. mapred.job.shuffle.merge.percent	--> 0.66
> 6. mapred.inmem.merge.threshold	--> 1000
> 7. I have nearly 5000 mappers which are supposed to produce LZO compressed outputs. The
logs seem to indicate that the map outputs range between 0.3G to 0.8GB. 
> Does anything here seem amiss? I'd appreciate any input of what settings to try. I can
try different reduced values for the input buffer percent and the merge percent.  Given that
the job runs for about 7-8 hours before crashing, I would like to make some informed choices
if possible.
> Thanks. 
> ~ Niranjan.

View raw message