pig-user mailing list archives

From Daniel Dai <jiany...@yahoo-inc.com>
Subject Re: should the following query work?
Date Fri, 10 Dec 2010 01:16:21 GMT
You can slice (project fields out of) a bag, but not a bag nested inside
another bag. If you do want to project x, do it early, before the second
group:

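A = load 'foo.txt' using PigStorage as (x : chararray, y : int);
B = group A by x;
-- project x now, while the bag A is still only one level deep
B1 = foreach B generate group, A.x as Ax;
C = group B1 by group;
-- B1 is a single-level bag here, so projecting group and Ax from it is allowed
E = foreach C generate B1.(group, Ax);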
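
To make the shape of the result concrete, here is a rough Python model of the script above. It is only a sketch: bags are modelled as lists and tuples as Python tuples, the sample rows and the helper `group_by` are made up for illustration, and real Pig does not guarantee the ordering shown here.

```python
from collections import defaultdict

def group_by(records, key):
    """Hypothetical stand-in for Pig's GROUP: returns (group, bag) pairs."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return list(groups.items())

# Made-up sample rows standing in for foo.txt, with schema (x, y)
A = [("a", 1), ("a", 2), ("b", 3)]

# B = group A by x;
B = group_by(A, key=lambda r: r[0])

# B1 = foreach B generate group, A.x as Ax;   (project x early)
B1 = [(g, [(x,) for (x, _y) in bag]) for g, bag in B]

# C = group B1 by group;
C = group_by(B1, key=lambda r: r[0])

# E = foreach C generate B1.(group, Ax);   (one field per record: a bag)
E = [([(g, ax) for (g, ax) in bag],) for _g, bag in C]

for record in E:
    print(record)
```

Each record of E holds one bag containing a single (group, Ax) tuple, because the second group is keyed on a field that is already unique per record in B1.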

Daniel

Kris Coward wrote:
> Dare I ask why such a query would be used? AFAICT the second group
> operation would just stick each record in a bag and create an extra
> copy of group on the outside of the bag (but use up a lot more
> computational power than a UDF that would just do the same thing
> explicitly).
>
> Cheers,
> Kris
>
> On Thu, Dec 09, 2010 at 03:34:58PM -0800, Lin Guo wrote:
>   
>> A = load 'foo.txt' using PigStorage as (x : chararray, y : int);
>>
>> B = group A by x;
>> C = group B by group;
>> describe C;
>>
>> -- we got
>> -- C: {group: chararray,B: {group: chararray,A: {x: chararray,y: int}}}
>>
>> D = foreach C generate B.(group, A);  -- this works
>> describe D;
>>
>> E = foreach C generate B.(group, A.(x));
>> describe E;
>> -- Pig returns a syntax error, but should this work? Or is there a patch for it?
>>
>> thanks,
>> lin
>>     
>
>   

