hadoop-pig-dev mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Pig filter by fails at backend , what am i doing wrong?
Date Tue, 20 Jul 2010 05:36:33 GMT
Isn't that the same thing I am doing?

I actually tried to flatten the bag and then flatten the tuple as well. It
seems Pig internally still constructs a tree that includes a projection, which
gives the same error. That of course makes sense: it builds the visitor tree
behind the scenes, which has to contain a projection sooner or later.


I actually stumbled upon the same error in a completely different context,
and I will probably try to patch it. There seems to be a clear error in the
logic at that place: it processes both tuples and non-tuples, but somehow
ignores the latter case and tries to cast to a Tuple anyway. I am not sure
why. Perhaps the fix is just to remove the cast. This is the second time today
that this error has popped up for me, in different contexts.
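
To illustrate the kind of logic error I mean, here is a minimal sketch in plain
Java. It is not the actual POProject code: the class and method names are
hypothetical, a java.util.List stands in for org.apache.pig.data.Tuple, and
passing a non-tuple value through unchanged is only my assumption about what a
fix might do.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the projection logic; a List models a Pig Tuple.
class ProjectCastSketch {

    // Unguarded variant: assumes the input is always a tuple.
    // Throws ClassCastException when it is handed a plain Integer.
    static Object projectUnchecked(Object input, int column) {
        List<?> tuple = (List<?>) input;
        return tuple.get(column);
    }

    // Guarded variant: only projects when the input really is a tuple.
    // Non-tuple values are passed through unchanged (an assumption, not Pig's behavior).
    static Object projectChecked(Object input, int column) {
        if (input instanceof List<?>) {
            return ((List<?>) input).get(column);
        }
        return input;
    }

    public static void main(String[] args) {
        Object tuple = Arrays.asList(7, 8);
        Object scalar = Integer.valueOf(1);

        System.out.println(projectChecked(tuple, 1));
        System.out.println(projectChecked(scalar, 0));

        try {
            projectUnchecked(scalar, 0);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getMessage());
        }
    }
}
```

The unguarded method fails in the same way as the stack trace quoted below,
while the guarded one checks the runtime type before casting.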

Thanks.

-Dmitriy

On Mon, Jul 19, 2010 at 10:03 PM, Jeff Zhang <zjffdu@gmail.com> wrote:

> It looks like a bug in Pig.
> I tried the following script:
>
> a = load 'data/a.txt' as (b:bag{t:tuple(f1:int,f2:int)});
> result = foreach a generate FLATTEN(b) as c;
> describe result;
>
> The output is:
> result: {c: int,f2: int}
> Here c is treated as a single field of the tuple rather than as the tuple itself.
>
>
> On Tue, Jul 20, 2010 at 6:00 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
>
> > Hi,
> >
> > I would greatly appreciate somebody's help with the following Pig error
> > during the MapReduce job.
> >
> > All mappers fail with the following stack trace:
> >
> > java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POIsNull.getNext(POIsNull.java:152)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POAnd.getNext(POAnd.java:67)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNext(POLimit.java:85)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:255)
> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:232)
> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
> >        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> > The Pig script fragment causing this is as follows:
> > IMP_F2 = foreach IMP_F1 generate ... , FLATTEN(contentRatings) as contentRating;
> > IMP_F3 = filter IMP_F2 by contentRating is not null and contentRating.vendorId==1
> >
> > If I remove the IMP_F3 line then the job goes through, but adding the
> > IMP_F3 filter causes this.
> > describe IMP_F2 produces:
> >
> > IMP_F2: {... ,contentRating: (vendorId: int, ... ), ... }
> >
> >
> > I also tried casts like 'filter by ...
> > (int)(contentRating.vendorId)==1', which did not change anything.
> >
> > Any ideas for a workaround are appreciated.
> >
> > Thanks in advance.
> > -Dmitriy
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
