pig-user mailing list archives

From Daniel Dai <jiany...@yahoo-inc.com>
Subject Re: FOREACH and FLATTEN Syntax
Date Wed, 08 Dec 2010 00:43:21 GMT
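When you flatten a bag, you get the fields inside each tuple as top-level
fields, not a tuple. Your AS clause declares the first field as a tuple T,
but it is actually a string, which is why T.f2 fails with the cast error.
Change the FOREACH statement to: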
flat_foo = FOREACH foo GENERATE FLATTEN($0) as (f1, f2, f3, f4, f5);

DUMP flat_foo;
(a,b,c,d,e)
(1,2,3,4,5)
...
(f,g,h,i,j)
(6,7,8,9,10)

subset_foo = FOREACH flat_foo GENERATE f2, f4, f5;
DUMP subset_foo;

(b,d,e)
(2,4,5)
...
(g,i,j)
(7,9,10)
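
If you would rather not name the fields at all, positional references should
work too. This is only a sketch: it assumes the same foo relation as above
and drops the AS clause, so the flattened fields are unnamed. Positions are
zero-based, so $1, $3 and $4 correspond to f2, f4 and f5.

```pig
-- Sketch, assuming foo is the relation from the original question.
-- No AS clause here, so the flattened fields get no names.
flat_foo = FOREACH foo GENERATE FLATTEN($0);

-- Shows whatever schema Pig knows for flat_foo, if any.
DESCRIBE flat_foo;

-- Zero-based positions: $1 = 2nd field, $3 = 4th, $4 = 5th.
subset_foo = FOREACH flat_foo GENERATE $1, $3, $4;
DUMP subset_foo;
```

Running DESCRIBE before the second FOREACH is a quick way to check whether a
field is a tuple or a plain value before you dereference it with a dot.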

Daniel

Xavier Stevens wrote:
> I'm currently running into an issue where I have a bag of tuples like so:
>
>> DUMP foo;
>  ( {(a,b,c,d,e), (1,2,3,4,5)}, ... , {(f,g,h,i,j), (6,7,8,9,10)} )
>
> Each one of the tuples has the same number of fields.  So I try to
> flatten the structure so I can get just the 1st, 3rd and 4th elements of
> each inner tuple.
>
>> flat_foo = FOREACH foo GENERATE FLATTEN($0) AS (T: tuple(f1:chararray,
>> f2:chararray, f3:chararray, f4:chararray, f5:chararray));
>> DUMP flat_foo;
> (a, b, c, d, e)
> (1, 2, 3, 4, 5)
> ...
> (f,g,h,i,j)
> (6,7,8,9,10)
>
>> subset_foo = FOREACH flat_foo GENERATE T.f2, T.f4, T.f5;
>> DUMP subset_foo;
>
> When I do this I end up getting a casting error "ERROR 2997: Unable to
> recreate exception from backed error: java.lang.ClassCastException:
> java.lang.String cannot be cast to org.apache.pig.data.Tuple".
>
>
> Anyone know what I am doing wrong here?
>
>
> Thanks,
>
>
> -Xavier
>   

