pig-user mailing list archives

From Alan Gates <ga...@yahoo-inc.com>
Subject Re: Easy question...difference between this::form and this.form?
Date Mon, 06 Dec 2010 21:13:19 GMT
The reason it's needed is that ambiguities would result otherwise.

A = load 'foo' as (x, y, z);
B = load 'bar' as (w, x, y, z);
C = join A by x, B by x;
D = filter C by z > 0;  -- which z?

As long as the name is not ambiguous, the :: is not required. So in
the above example it would be perfectly legal to say:

D = filter C by w > 0;
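
To make the contrast concrete, here is a minimal sketch that reuses the relations above. It shows an ambiguous name being qualified with ::, an unambiguous one left bare, and a single foreach that renames the prefixed fields once. The short names after "as" (a_z, b_z) and the aliases E and F are illustrative assumptions, not anything from the original script.

```pig
A = load 'foo' as (x, y, z);
B = load 'bar' as (w, x, y, z);
C = join A by x, B by x;

-- z exists in both inputs, so say which relation it came from
D = filter C by A::z > 0;

-- w exists only in B, so the bare name still resolves
E = filter C by w > 0;

-- rename once; later statements can then use the short names
-- (a_z and b_z are made-up names for this example)
F = foreach C generate A::x as x, A::z as a_z, B::z as b_z, w;
```

After F, none of the fields carry a prefix, so downstream statements do not need :: at all.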

Out of curiosity, why do you want to remove the :: names?

Alan.

On Dec 6, 2010, at 1:05 PM, Jonathan Coveney wrote:

> Hijack away. I would be curious as to the reason we need this as well.
>
> 2010/12/6 Anze <anzenews@volja.net>
>
>>
>> Sorry to hijack your question, Jonathan, but while we are at it... :)
>>
>> Is there a way to tell Pig NOT to add "base_alias::"? Almost half my code
>> consists of FOREACH... GENERATE that just remove these prefixes.
>>
>> Thanks,
>>
>> Anze
>>
>> On Monday 06 December 2010, Daniel Dai wrote:
>>> After join, cross, foreach flatten, Pig will automatically add
>>> "base_alias::" prefix. All other cases use "."
>>>
>>> Daniel
>>>
>>> Jonathan Coveney wrote:
>>>> It's very hard to search for this among the docs because it's so generic,
>>>> so I thought I'd ask... I'm sure the answer is painfully easy.
>>>>
>>>> Taking a look at this code that I found online, for example
>>>>
>>>> --
>>>> -- Read in a bag of tuples (timeseries for this example) and divide the
>>>> -- numeric column by its maximum.
>>>> --
>>>> %default DATABAG 'data/timeseries.tsv'
>>>>
>>>> data       = LOAD '$DATABAG' AS (month:chararray, count:int);
>>>> accumulate = GROUP data ALL;
>>>> calc_max   = FOREACH accumulate GENERATE FLATTEN(data),
>>>>              MAX(data.count) AS max_count;
>>>> normalize  = FOREACH calc_max GENERATE data::month AS month,
>>>>              data::count AS count,
>>>>              (float)data::count / (float)max_count AS normed_count;
>>>> DUMP normalize;
>>>>
>>>> What purpose does data::month serve versus data.count?
>>>>
>>>> Thanks
>>
>>

