hadoop-mapreduce-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: Java Heap memory error : Limit to 2 Gb of ShuffleRamManager ?
Date Thu, 06 Dec 2012 18:41:27 GMT
Oliver,

 Sorry, missed this.

 The historical reason, if I remember right, is that we used to have a single byte buffer and hence the limit.

 We should definitely remove it now since we don't use a single buffer. Mind opening a jira?
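(Editor's note: the single-buffer limit Arun mentions follows from a Java language constraint. A minimal standalone sketch, not taken from the Hadoop source, illustrating why any design backed by one byte[] caps out near 2 GB:)

```java
public class SingleBufferLimit {
    public static void main(String[] args) {
        // Java arrays are indexed by int, so a single byte[] can hold at
        // most Integer.MAX_VALUE (2^31 - 1, roughly 2 GB) elements.
        long desired = 3L * 1024 * 1024 * 1024; // 3 GB requested
        // Any single-buffer allocation must be clamped to the int range:
        int capped = (int) Math.min(desired, Integer.MAX_VALUE);
        System.out.println(capped); // 2147483647
    }
}
```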


 http://wiki.apache.org/hadoop/HowToContribute

thanks!
Arun

On Dec 6, 2012, at 8:01 AM, Olivier Varene - echo wrote:

> anyone ?
> 
> Begin forwarded message:
> 
>> From: Olivier Varene - echo <varene@echo.fr>
>> Subject: ReduceTask > ShuffleRamManager : Java Heap memory error
>> Date: 4 December 2012 09:34:06 CET
>> To: mapreduce-user@hadoop.apache.org
>> Reply-To: mapreduce-user@hadoop.apache.org
>> 
>> 
>> Hi to all,
>> first many thanks for the quality of the work you are doing : thanks a lot
>> 
>> I am facing a bug with the memory management at shuffle time, I regularly get
>> 
>> Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1612)
>> 
>> 
>> reading the code in org.apache.hadoop.mapred.ReduceTask.java file
>> 
>> the "ShuffleRamManager" is limiting the maximum RAM allocation to Integer.MAX_VALUE * maxInMemCopyUse ?
>> 
>> maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
>>            (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
>>          * maxInMemCopyUse);
>> 
>> Why is it so?
>> And why is it cast down to an int, when the raw value (Runtime.getRuntime().maxMemory()) is a long?
>> 
>> Does it mean that a Reduce Task cannot take advantage of more than 2 GB of memory?
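(Editor's note: a self-contained sketch of the computation quoted above, with the conf lookup replaced by a hard-coded constant for illustration; the variable names mirror the snippet, the heap size is invented. It shows that the clamp to Integer.MAX_VALUE makes the resulting buffer size independent of how large the actual heap is:)

```java
public class ShuffleLimitDemo {
    public static void main(String[] args) {
        // Hypothetical stand-in for mapred.job.shuffle.input.buffer.percent:
        double maxInMemCopyUse = 0.70;
        // Pretend the JVM was started with an 8 GB heap:
        long maxMemory = 8L * 1024 * 1024 * 1024;
        // As in ReduceTask: the heap size (a long) is clamped to
        // Integer.MAX_VALUE before the percentage is applied, so the
        // shuffle buffer can never exceed ~2 GB * percent.
        int maxSize = (int) (Math.min(maxMemory, Integer.MAX_VALUE) * maxInMemCopyUse);
        System.out.println(maxSize); // 1503238552, i.e. ~1.4 GB despite the 8 GB heap
    }
}
```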
>> 
>> To explain a little bit my use case, 
>> I am processing some 2700 maps (each working on a 128 MB block of data), and when the reduce phase starts, it sometimes stumbles with Java heap memory issues.
>> 
>> configuration is : java 1.6.0-27
>> hadoop 0.20.2
>> -Xmx1400m
>> io.sort.mb 400
>> io.sort.factor 25
>> io.sort.spill.percent 0.80
>> mapred.job.shuffle.input.buffer.percent 0.70
>> ShuffleRamManager: MemoryLimit=913466944, MaxSingleShuffleLimit=228366736
>> 
>> I will decrease mapred.job.shuffle.input.buffer.percent to limit the errors, but I am not fully confident in the scalability of the process.
>> 
>> Any help would be welcomed
>> 
>> once again, many thanks
>> Olivier
>> 
>> 
>> P.S: sorry if I misunderstood the code, any explanation would be really welcomed
>> 
>> -- 
>>  
>>  
>>  
>> 
>> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


