hadoop-pig-dev mailing list archives

From Jeff Zhang <zjf...@gmail.com>
Subject Re: Pig filter by fails at backend , what am i doing wrong?
Date Tue, 20 Jul 2010 05:03:59 GMT
It looks like a bug in Pig.
I tried the following script:

a = load 'data/a.txt' as (b:bag{t:tuple(f1:int,f2:int)});
result = foreach a generate FLATTEN(b) as c;
describe result;

The output is:

result: {c: int,f2: int}

Here c is treated as a single field of the tuple rather than as the tuple itself.
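A possible workaround (a sketch only, based on the same sample schema above; untested against the Pig version in this thread) is to name the flattened fields explicitly instead of giving the whole FLATTEN a single alias, so each field stays addressable without projecting through a tuple:

a = load 'data/a.txt' as (b:bag{t:tuple(f1:int,f2:int)});
-- alias each flattened field, rather than "FLATTEN(b) as c"
result = foreach a generate FLATTEN(b) as (f1, f2);
result2 = filter result by f1 is not null and f1 == 1;

With explicit field names, describe result should report result: {f1: int,f2: int}, and the filter can reference f1 directly.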


On Tue, Jul 20, 2010 at 6:00 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> Hi,
>
> I would greatly appreciate somebody's help with the following Pig error
> during MR.
>
> all mappers fail with the following stack trace
>
> java.lang.ClassCastException: java.lang.Integer cannot be cast to
> org.apache.pig.data.Tuple
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POIsNull.getNext(POIsNull.java:152)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POAnd.getNext(POAnd.java:67)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNext(POLimit.java:85)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:255)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:232)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
>
>
>
>
> The Pig script fragment causing this is as follows:
> IMP_F2 = foreach IMP_F1 generate ... , FLATTEN(contentRatings) as
> contentRating;
> IMP_F3 = filter IMP_F2 by contentRating is not null and
> contentRating.vendorId==1;
>
> If I remove the IMP_F3 line, the job goes through, but adding the
> IMP_F3 filter causes this.
> describe IMP_F2 produces:
>
> IMP_F2: {... ,contentRating: (vendorId: int, ... ), ... }
>
>
> I also tried casts like 'filter by ...
> (int)(contentRating.vendorId)==1', which did not change anything.
>
> Any ideas for workaround are appreciated.
>
> Thanks in advance.
> -Dmitriy
>



-- 
Best Regards

Jeff Zhang
