hadoop-pig-dev mailing list archives

From "David Ciemiewicz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-746) Works in --exectype local, fails on grid - ERROR 2113: SingleTupleBag should never be serialized
Date Fri, 03 Apr 2009 00:47:13 GMT

    [ https://issues.apache.org/jira/browse/PIG-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695198#action_12695198 ]

David Ciemiewicz commented on PIG-746:
--------------------------------------

I'd still like to use the combiner in other instances in my combined Pig scripts (I concatenate
several Pig scripts together to create compound Pig scripts).

It would be nice if Pig had a per-statement option to turn the combiner off or force it on.

In the meantime, I discovered a "feature" (flaw?) in Pig that turns off the combiner: perform
a scalar operation (such as +0L) on the result of the Algebraic aggregation function.

D = foreach B generate
        group,
        SUM(A.matched) + 0L  as matchedcount, -- +0L "flaw" turns off combiner
        A;
describe D;

I have tried this workaround and it works, at least in the current version of Pig.  It should
do until someone figures out how to permit use of the combiner for combined Algebraic and
scalar operations.
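
For reference, here is a rough sketch in plain Python (not Pig) of what the script quoted
below should produce on filterbug.data, with the "+ 0" workaround included. It only models
the expected result, so it says nothing about the combiner or the serialization error. The
names ROWS and run_pipeline are made up for this illustration, and Pig does not guarantee
the output order shown by Python.

```python
import re
from collections import defaultdict

# Rows of filterbug.data as (id, str) pairs, copied from the issue description.
ROWS = [
    ("a", "hello"),
    ("a", "goodbye"),
    ("b", "goodbye"),
    ("c", "hello"),
    ("c", "hello"),
    ("c", "hello"),
    ("e", "what"),
]


def run_pipeline(rows):
    """Plain-Python model of relations A through F in the quoted script."""
    # A: add matched = 1 when str fully matches the pattern 'hello', else 0.
    a = [(i, s, 1 if re.fullmatch("hello", s) else 0) for i, s in rows]

    # B: group A by id, keeping every record of A in the group's bag.
    bags = defaultdict(list)
    for record in a:
        bags[record[0]].append(record)

    # D: (group, SUM(A.matched) + 0, A). Adding 0 leaves the value unchanged.
    d = [(key, sum(r[2] for r in bag) + 0, bag) for key, bag in bags.items()]

    # E: keep only groups with at least one match.
    e = [rec for rec in d if rec[1] > 0]

    # F: FLATTEN(A), i.e. emit the original records of the surviving groups.
    return [record for _, _, bag in e for record in bag]


if __name__ == "__main__":
    for record in run_pipeline(ROWS):
        print(record)
```

Only the groups for ids a and c contain a 'hello' row, so all records of those two groups
come back and the groups for b and e are dropped.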

> Works in --exectype local, fails on grid - ERROR 2113: SingleTupleBag should never be serialized
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-746
>                 URL: https://issues.apache.org/jira/browse/PIG-746
>             Project: Pig
>          Issue Type: Bug
>            Reporter: David Ciemiewicz
>
> The script below works on Pig 2.0 local mode but fails when I run the same program on the grid.
> I was attempting to create a workaround for PIG-710.
> Here's the error:
> {code}
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2113: SingleTupleBag should never be serialized
> or serialized.
>         at org.apache.pig.data.SingleTupleBag.write(SingleTupleBag.java:129)
>         at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:147)
>         at org.apache.pig.data.DefaultTuple.write(DefaultTuple.java:291)
>         at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:83)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:439)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:101)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:219)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:208)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:86)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> {code}
> Here's the program:
> {code}
> A = load 'filterbug.data' using PigStorage() as ( id, str );
> A = foreach A generate
>         id,
>         str,
>         (
>         str matches 'hello' or
>         str matches 'hello'
>         ? 1 : 0
>         )                       as matched;
> describe A;
> B = group A by ( id );
> describe B;
> D = foreach B generate
>         group,
>         SUM(A.matched)  as matchedcount,
>         A;
> describe D;
> E = filter D by matchedcount > 0;
> describe E;
> F = foreach E generate
>         FLATTEN(A);
> describe F;
> dump F;
> {code}
> Here's the data filterbug.data
> {code}
> a       hello
> a       goodbye
> b       goodbye
> c       hello
> c       hello
> c       hello
> e       what
> {code}
> 		

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

