hadoop-pig-dev mailing list archives

From "David Ciemiewicz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-577) outer join query looses name information
Date Wed, 24 Dec 2008 17:12:44 GMT

    [ https://issues.apache.org/jira/browse/PIG-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659119#action_12659119 ]

David Ciemiewicz commented on PIG-577:
--------------------------------------

Note that while this workaround does solve the schema problem, the resulting tuples in D will
not have enough null elements when either A or B is empty, especially when A is empty.

Semantically, it seems that the correct D statement should be something like:

D = FOREACH C GENERATE group, flatten((not IsEmpty(A) ? A : (null,null,null))), flatten((not IsEmpty(B) ? B : (null,null,null)));

However, this generates all sorts of parse errors.
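
To make the intended result concrete, here is a minimal sketch in Python (not Pig) that models a COGROUP followed by a flatten in which an empty bag is replaced by one all-null tuple of the right width. The function names `cogroup` and `outer_join` and the sample rows are hypothetical, chosen only for illustration. The widths follow the schemas in the quoted query: A has three fields and B has four, so B would presumably need four nulls rather than three.

```python
from collections import defaultdict
from itertools import product


def cogroup(a_rows, b_rows, key=0):
    """Model of COGROUP A BY name, B BY name: key -> (bag of A rows, bag of B rows)."""
    groups = defaultdict(lambda: ([], []))
    for row in a_rows:
        groups[row[key]][0].append(row)
    for row in b_rows:
        groups[row[key]][1].append(row)
    return groups


def outer_join(a_rows, b_rows, a_width, b_width, key=0):
    """Flatten each group, padding an empty bag with a single all-null tuple."""
    result = []
    for group, (a_bag, b_bag) in sorted(cogroup(a_rows, b_rows, key).items()):
        # An empty bag contributes one tuple of None values instead of nothing,
        # so every output row has the same number of fields.
        a_side = a_bag or [(None,) * a_width]
        b_side = b_bag or [(None,) * b_width]
        for a, b in product(a_side, b_side):
            result.append((group,) + a + b)
    return result


# Hypothetical sample data matching the field counts of A (3) and B (4).
students = [("alice", 20, 3.5), ("bob", 22, 3.1)]
voters = [("bob", 22, "democrat", 100.0), ("carol", 30, "green", 50.0)]

rows = outer_join(students, voters, a_width=3, b_width=4)
for row in rows:
    print(row)
```

Every output row has eight fields (group, three from A, four from B), which is the property the single-null workaround loses when one side is empty.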

> outer join query looses name information
> ----------------------------------------
>
>                 Key: PIG-577
>                 URL: https://issues.apache.org/jira/browse/PIG-577
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>             Fix For: types_branch
>
>
> The following query:
> A = LOAD 'student_data' AS (name: chararray, age: int, gpa: float);
> B = LOAD 'voter_data' AS (name: chararray, age: int, registration: chararray, contributions: float);
> C = COGROUP A BY name, B BY name;
> D = FOREACH C GENERATE group, flatten((IsEmpty(A) ? null : A)), flatten((IsEmpty(B) ? null : B));
> describe D;
> E = FOREACH D GENERATE A::gpa, B::contributions;
> Gives the following error, even though describe shows the correct information (D: {group: chararray,A::name: chararray,A::age: int,A::gpa: float,B::name: chararray,B::age: int,B::registration: chararray,B::contributions: float}):
> java.io.IOException: Invalid alias: A::gpa in {group: chararray,bytearray,bytearray}
>         at org.apache.pig.PigServer.parseQuery(PigServer.java:298)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:263)
>         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:439)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:249)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>         at org.apache.pig.Main.main(Main.java:306)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: A::gpa in {group: chararray,bytearray,bytearray}
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5930)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5788)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:3974)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3871)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3825)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3734)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3660)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3626)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3552)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3462)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3419)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2894)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2309)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:966)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:742)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:537)
>         at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
>         at org.apache.pig.PigServer.parseQuery(PigServer.java:295)
>         ... 6 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

