hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-772) Semantics of Filter statement inside ForEach should support filtering on aliases used in the Group statement preceding it
Date Wed, 22 Sep 2010 00:42:34 GMT

     [ https://issues.apache.org/jira/browse/PIG-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates resolved PIG-772.
----------------------------

    Resolution: Invalid

The error message here is poor, but the script itself is in error.  The FILTER line tries
to do an implicit join by referencing two relations (N and A), and Pig does not allow a
filter operator to have multiple inputs.
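
One way to get the intended result is to make the join explicit. The script below is a
minimal sketch and is not taken from the issue: the aliases AVGS, J, HALF, R and G and the
field name avg_val are made up for illustration, and it assumes a Pig version that provides
JOIN and implicitly casts the int val to double for the comparison.

{code}
A = LOAD 'half.txt' AS (key:chararray, val:int);
B = GROUP A BY key;
-- One row per key, carrying that key's average.
AVGS = FOREACH B GENERATE group AS key, AVG(A.val) AS avg_val;
-- The explicit join puts each original row next to its group's average.
J = JOIN A BY key, AVGS BY key;
-- The filter now has a single input relation (J).
HALF = FILTER J BY A::val >= AVGS::avg_val;
-- Drop the joined columns, then regroup to get one bag per key.
R = FOREACH HALF GENERATE A::key AS key, A::val AS val;
G = GROUP R BY key;
C = FOREACH G GENERATE group, R;
DUMP C;
{code}

With the half.txt input quoted below, this should keep the same rows as the Expected
Output, although the order of tuples inside each bag is not guaranteed.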


> Semantics of Filter statement inside ForEach should support filtering on aliases used in the Group statement preceding it
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-772
>                 URL: https://issues.apache.org/jira/browse/PIG-772
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.3.0
>            Reporter: Viraj Bhat
>            Assignee: Alan Gates
>            Priority: Minor
>             Fix For: 0.9.0
>
>         Attachments: half.txt
>
>
> I have a Pig script that tries to display, for each group, all tuples whose value is greater than or equal to the average value in the group.
> Input: half.txt
> ===================
> A       1
> A       2
> A       3
> B       1
> B       3
> ====================
> {code}
> A = LOAD 'half.txt' AS (key:CHARARRAY, val:INT);
> B = GROUP A BY key;
> C = FOREACH B {
>        N = AVG(A.val);
>        HALF = FILTER A by val >= N;
>     GENERATE
>        FLATTEN(GROUP),
>        HALF;
> };
> dump C;
> {code}
> ====================
> Expected Output:
> ====================
> (A,{(A,2),(A,3)})
> (B,{(B,3)})
> ====================
> Presently the semantics of the FILTER statement inside the FOREACH do not support this type of operation.
> Running the above script produces the following error:
> =========================================================================================
> ERROR 1000: Error during parsing. Invalid alias: A in {key: chararray,val: int}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Invalid alias: A in {key: chararray,val: int}
>         at org.apache.pig.PigServer.parseQuery(PigServer.java:320)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:279)
>         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:529)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:280)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>         at org.apache.pig.Main.main(Main.java:364)
> =========================================================================================

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

