pig-dev mailing list archives

From "Eyal Allweil (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2836) Namespace in Pig macros collides with Pig scripts
Date Sun, 26 Oct 2014 16:42:33 GMT

    [ https://issues.apache.org/jira/browse/PIG-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184534#comment-14184534 ]

Eyal Allweil commented on PIG-2836:

Here is a script that demonstrates this problem. I've run it on Pig 0.11.1 and 0.10.0. The
comment after each "describe" statement shows the output it produced. Note how the schema of
the outer "tagged" relation changes after the macro is called.

tagged = LOAD '/tagged.csv' USING PigStorage(',') AS (id: chararray, number: int);

describe tagged; -- tagged: {id: chararray,number: int}

define change_tagged(tagged_set, columnToKeep) returns counted {
        -- this alias has the same name as the "tagged" relation in the calling script
        tagged = FOREACH $tagged_set GENERATE $columnToKeep;

        grp_all = GROUP tagged ALL;

        $counted = FOREACH grp_all GENERATE COUNT(tagged);
};

results = change_tagged(tagged, id);

describe tagged; -- tagged: {id: chararray}
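
A possible workaround until this is fixed, as an untested sketch: give every alias inside the macro a name that cannot match anything in the calling script. The "ct_" prefix below is only a hypothetical naming convention, not something Pig requires or recognizes. The intent is that the final "describe" reports the original two-column schema, since the macro no longer defines an alias called "tagged".

```
tagged = LOAD '/tagged.csv' USING PigStorage(',') AS (id: chararray, number: int);

-- Same macro as above, but every internal alias carries a macro-specific
-- prefix (ct_, a hypothetical convention) so none matches a caller alias.
define change_tagged(tagged_set, columnToKeep) returns counted {
        ct_projected = FOREACH $tagged_set GENERATE $columnToKeep;

        ct_grp_all = GROUP ct_projected ALL;

        $counted = FOREACH ct_grp_all GENERATE COUNT(ct_projected);
};

results = change_tagged(tagged, id);

describe tagged;
```

This only sidesteps the collision by convention. It does not give the macro its own namespace, so a caller that happens to use the same prefix would hit the same problem.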

> Namespace in Pig macros collides with Pig scripts
> -------------------------------------------------
>                 Key: PIG-2836
>                 URL: https://issues.apache.org/jira/browse/PIG-2836
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt, parser
>    Affects Versions: 0.9.2, 0.10.0, 0.11, 0.10.1
>            Reporter: Russell Jurney
>            Assignee: Alan Gates
>            Priority: Critical
>              Labels: bacon, confit, goto, hash, macros, pig, sad
> Relation names in macros collide with relation names in the calling pig script. This
> is my most common source of errors and it makes writing macros hard. Suggest that the macro
> processor create a unique namespace for all relations in a macro other than $in and $out.
> Prepend something to each relation name or somehow create a unique per-macro namespace.
> This may conflict with some uses of macros where relation names are passed through passively,
> but this is always avoidable by supplying parameters and feels GOTO f*cked.

This message was sent by Atlassian JIRA
