[ https://issues.apache.org/jira/browse/PIG-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659119#action_12659119
]
David Ciemiewicz commented on PIG-577:
--------------------------------------
Note that while this workaround does solve the schema problem, the resulting tuples in D will
not have sufficient null elements if either A or B is null. Especially if A is null.
Semantically, ït seems that the correct D statement should be something like:
D = FOREACH C GENERATE group, flatten((not IsEmpty(A) ? A: (null,null,null) ), flatten((not
IsEmpty(B) ? B: (null,null,null) ));
However, this generates all sorts of parse errors.
> outer join query looses name information
> ----------------------------------------
>
> Key: PIG-577
> URL: https://issues.apache.org/jira/browse/PIG-577
> Project: Pig
> Issue Type: Bug
> Affects Versions: types_branch
> Reporter: Olga Natkovich
> Fix For: types_branch
>
>
> The following query:
> A = LOAD 'student_data' AS (name: chararray, age: int, gpa: float);
> B = LOAD 'voter_data' AS (name: chararray, age: int, registration: chararray, contributions:
float);
> C = COGROUP A BY name, B BY name;
> D = FOREACH C GENERATE group, flatten((IsEmpty(A) ? null : A)), flatten((IsEmpty(B) ?
null : B));
> describe D;
> E = FOREACH D GENERATE A::gpa, B::contributions;
> Give the following error: (Even though describe shows correct information: D: {group:
chararray,A::name: chararray,A::age: int,A::gpa: float,B::name: chararray,B::age: int,B::registration:
chararray,B::contributions: float}
> java.io.IOException: Invalid alias: A::gpa in {group: chararray,bytearray,bytearray}
> at org.apache.pig.PigServer.parseQuery(PigServer.java:298)
> at org.apache.pig.PigServer.registerQuery(PigServer.java:263)
> at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:439)
> at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:249)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
> at org.apache.pig.Main.main(Main.java:306)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: A::gpa
in {group: chararray,bytearray,bytearray}
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5930)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5788)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:3974)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3871)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3825)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3734)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3660)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3626)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3552)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3462)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3419)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2894)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2309)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:966)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:742)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:537)
> at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
> at org.apache.pig.PigServer.parseQuery(PigServer.java:295)
> ... 6 more
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
|