hadoop-user mailing list archives

From Eduard Skaley <e.v.ska...@gmail.com>
Subject Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space
Date Mon, 05 Nov 2012 13:20:33 GMT
We increased mapreduce.reduce.memory.mb to 2GB and 
mapreduce.reduce.java.opts to 1.5GB.
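
For reference, the mapred-site.xml entries we changed look roughly like this
(the -Xmx value is my reading of "1.5GB", not necessarily the literal opts
string we use):

    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <!-- assumed: 1.5GB expressed as a JVM heap flag -->
      <value>-Xmx1536m</value>
    </property>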

Now we are getting livelocks on our jobs: the map tasks don't start.

We are using the CapacityScheduler because we had livelocks with the FifoScheduler.
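
The scheduler switch is the usual yarn-site.xml setting, roughly:

    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>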

Does anybody have a clue?
> By the way, it happens on YARN, not on MRv1.
>> each container gets 1GB at the moment.
>>> Can you try increasing memory per reducer?
>>>
>>>
>>> On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e.v.skaley@gmail.com> wrote:
>>>
>>>     Hello,
>>>
>>>     I'm getting this Error through job execution:
>>>
>>>     16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
>>>     16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
>>>     16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
>>>     16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
>>>     16:20:32 INFO  [main]                     Job - Task Id : attempt_1351680008718_0018_r_000006_0, Status : FAILED
>>>     Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2
>>>         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
>>>     Caused by: java.lang.OutOfMemoryError: Java heap space
>>>         at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>>>         at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>>>         at org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>>>         at org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>>>         at org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>>>
>>>     16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
>>>     16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
>>>     16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
>>>     16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
>>>     16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>>>
>>>     I have no clue what the issue could be. I googled it and checked
>>>     several possible solutions, but nothing fits.
>>>
>>>     I saw this JIRA entry, which could fit:
>>>     https://issues.apache.org/jira/browse/MAPREDUCE-4655
>>>
>>>     Here somebody recommends increasing the property
>>>     dfs.datanode.max.xcievers / dfs.datanode.max.transfer.threads to
>>>     4096, but that is already the value on our cluster:
>>>     http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
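>>>
>>>     For completeness, that setting lives in hdfs-site.xml on our
>>>     cluster, roughly:
>>>
>>>         <property>
>>>           <name>dfs.datanode.max.xcievers</name>
>>>           <value>4096</value>
>>>         </property>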
>>>
>>>     The too-small-input-files issue doesn't fit, I think, because the
>>>     map phase reads 137 files of about 130 MB each, with a block size
>>>     of 128 MB.
>>>
>>>     The cluster runs version 2.0.0-cdh4.1.1
>>>     (581959ba23e4af85afd8db98b7687662fe9c5f20).
>>>
>>>     Thx
>>>
>>>
>>> -- 
>>> Nitin Pawar
>>>
>>
>

