hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-766) ava.lang.OutOfMemoryError: Java heap space
Date Tue, 14 Apr 2009 23:50:14 GMT

    [ https://issues.apache.org/jira/browse/PIG-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698986#action_12698986
] 

Alan Gates commented on PIG-766:
--------------------------------

Can you attach your script, and give an idea of the data sizes you are trying to join?  Of
particular interest is the cardinality of the keys.

Also, if your inputs of very different sizes, you can try reducing the join order.  Pig always
materializes the first input's keys in memory and streams the second.  If one table has many
more instances of a given key than the other this can resolve the out of memory issue.

> ava.lang.OutOfMemoryError: Java heap space
> ------------------------------------------
>
>                 Key: PIG-766
>                 URL: https://issues.apache.org/jira/browse/PIG-766
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>         Environment: Hadoop-0.18.3 (cloudera RPMs).
> mapred.child.java.opts=-Xmx1024m
>            Reporter: Vadim Zaliva
>
> My pig script always fails with the following error:
> Java.lang.OutOfMemoryError: Java heap space
>        at java.util.Arrays.copyOf(Arrays.java:2786)
>        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
>        at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:213)
>        at org.apache.pig.data.DefaultTuple.write(DefaultTuple.java:291)
>        at org.apache.pig.data.DefaultAbstractBag.write(DefaultAbstractBag.java:233)
>        at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:162)
>        at org.apache.pig.data.DefaultTuple.write(DefaultTuple.java:291)
>        at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:83)
>        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>        at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:156)
>        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spillSingleRecord(MapTask.java:857)
>        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:467)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:101)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:219)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:208)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:86)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message