hadoop-mapreduce-user mailing list archives

From German Florez-Larrahondo <german...@samsung.com>
Subject RE: Re: memoryjava.lang.OutOfMemoryError related with number of reducer?
Date Tue, 15 Apr 2014 15:27:38 GMT
Lei

A good explanation of this can be found in Hadoop: The Definitive Guide by Tom White.

Here is an excerpt that explains the behavior on the reduce side and some possible tweaks
to control it.

 

https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-6/shuffle-and-sort
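For reference, the reduce-side tuning knobs that chapter discusses can be set in mapred-site.xml. This is only a sketch using the MRv1 property names; the values shown are the stock defaults, not recommendations for your cluster:

```xml
<!-- Reduce-side shuffle/merge tuning (MRv1 property names; values are defaults). -->
<property>
  <name>mapred.job.shuffle.input.buffer.percent</name>
  <value>0.70</value> <!-- fraction of reducer heap used to buffer map outputs -->
</property>
<property>
  <name>mapred.job.shuffle.merge.percent</name>
  <value>0.66</value> <!-- buffer fill level that triggers an in-memory merge -->
</property>
<property>
  <name>mapred.inmem.merge.threshold</name>
  <value>1000</value> <!-- number of fetched map outputs that triggers a merge -->
</property>
<property>
  <name>mapred.job.reduce.input.buffer.percent</name>
  <value>0.0</value> <!-- heap fraction allowed to retain map outputs during the reduce -->
</property>
```

Lowering the buffer percentages makes the reducer spill to disk earlier, trading speed for memory headroom.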


From: leiwangouc@gmail.com [mailto:leiwangouc@gmail.com] 
Sent: Tuesday, April 15, 2014 9:29 AM
To: user; th
Subject: Re: Re: memoryjava.lang.OutOfMemoryError related with number of reducer?

 

Thanks Thomas. 

 

Another question: I have no idea what "Failed to merge in memory" means. Is the 'merge' part
of the shuffle phase on the reducer side? Why does it happen in memory?

Apart from the two methods (increasing the reducer number and increasing the heap size), are
there any other alternatives to fix this issue?


  _____  

leiwangouc@gmail.com

 

From: Thomas Bentsen <th@bentzn.com>
Date: 2014-04-15 21:53
To: user <user@hadoop.apache.org>
Subject: Re: memoryjava.lang.OutOfMemoryError related with number of reducer?

When you increase the number of reducers they each have less to work
with, provided the data is distributed evenly between them - in this case
about one third of the original work.
It is essentially the same thing as increasing the heap size - it's
just distributed between more reducers.

 

/th
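The arithmetic behind this is easy to sketch. The 48 GB total map-output size below is a made-up figure for illustration (the real number would come from the job's shuffle counters), but it shows why going from 24 to 84 reducers shrinks each reducer's share to roughly a third:

```java
public class ShufflePerReducer {
    // Hypothetical total map-output size; in practice, read this from the job counters.
    static final long TOTAL_SHUFFLE_BYTES = 48L * 1024 * 1024 * 1024; // 48 GB

    // Per-reducer share in MB, assuming an even key distribution.
    static long perReducerMB(int reducers) {
        return TOTAL_SHUFFLE_BYTES / reducers / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("24 reducers -> " + perReducerMB(24) + " MB each"); // 2048 MB
        System.out.println("84 reducers -> " + perReducerMB(84) + " MB each"); // 585 MB
    }
}
```

Note the assumption: if the data is skewed, one hot key can still overload a single reducer no matter how many you add.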


On Tue, 2014-04-15 at 20:41 +0800, leiwangouc@gmail.com wrote:
> I can fix this by changing the heap size.
> But what confuses me is that when I change the reducer number from 24
> to 84, this error does not occur.
>
> Any insight on this?
>
> Thanks
> Lei

> Failed to merge in memoryjava.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2786)
> at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
> at java.io.DataOutputStream.write(DataOutputStream.java:90)
> at java.io.DataOutputStream.writeUTF(DataOutputStream.java:384)
> at java.io.DataOutputStream.writeUTF(DataOutputStream.java:306)
> at org.apache.pig.data.utils.SedesHelper.writeChararray(SedesHelper.java:66)
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:543)
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
> at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135)
> at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:613)
> at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:604)
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:447)
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
> at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135)
> at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:613)
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:443)
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
> at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135)
> at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:613)
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:443)
> at org.apache.pig.data.BinSedesTuple.write(BinSedesTuple.java:41)
> at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:123)
> at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:100)
> at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:84)
> at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:188)
> at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:1145)
> at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1456)
> at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:99)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:201)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:163)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
>
> ______________________________________________________________________
> leiwangouc@gmail.com
