hadoop-pig-dev mailing list archives

From "Santhosh Srinivasan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-379) describe interfiers with name resolution
Date Wed, 20 Aug 2008 12:24:47 GMT

    [ https://issues.apache.org/jira/browse/PIG-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623972#action_12623972 ]

Santhosh Srinivasan commented on PIG-379:
-----------------------------------------

The describe statement kicks off the logical plan -> type checker -> optimizer process.
During logical plan optimization, the schema of each operator is reset. When the schema
is recomputed, the computation uses the attributes of the operator along with the
information about its inputs. User-defined schemas specified with the as clause are
not annotated as such in each operator. As a result, when the schema is reset in the logical
optimizer, this information is lost, resulting in incorrect schemas.
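
The failure mode described above can be illustrated with a deliberately simplified toy model. This is not Pig's code: the Operator class and its methods are hypothetical names, and the model ignores operator inputs. It only shows that aliases stored in a cached schema, with no annotation marking them as user-specified, cannot be recovered once the schema is reset and recomputed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (alias or None, type name); a toy representation, not Pig's Schema class.
Field = Tuple[Optional[str], str]


@dataclass
class Operator:
    """Hypothetical stand-in for a logical operator."""
    types: List[str]                      # attribute the operator always knows
    schema: Optional[List[Field]] = None  # cached schema

    def apply_as_clause(self, aliases: List[str]) -> None:
        # Aliases go into the cached schema only; nothing records that
        # they were supplied by the user.
        self.schema = list(zip(aliases, self.types))

    def reset_schema(self) -> None:
        # What the optimizer pass does in this toy model.
        self.schema = None

    def get_schema(self) -> List[Field]:
        if self.schema is None:
            # Recompute from the operator's attributes alone; the aliases
            # are no longer available at this point.
            self.schema = [(None, t) for t in self.types]
        return self.schema


def render(schema: List[Field]) -> str:
    parts = [t if a is None else f"{a}: {t}" for a, t in schema]
    return "{" + ",".join(parts) + "}"


b = Operator(types=["chararray", "integer"])
b.apply_as_clause(["name", "age"])
before = render(b.get_schema())
b.reset_schema()
after = render(b.get_schema())
print(before)
print(after)
```

In this sketch the schema before the reset carries the aliases and the schema after it carries only the types, which mirrors the two describe outputs quoted in the issue below.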

There are multiple items that we need to consider:

1. Annotate each relational operator and expression operator with an attribute that denotes
the presence of user-specified schemas

2. Add checks to ensure that user-specified schemas are compatible with the generated/inferred
schemas, i.e.,
   a. if the user specifies incorrect types, then perform appropriate checks and type promotions
   b. if the schemas do not match, then flag it as an error

3. For complex constants, the schema computation is a bit complex and involves type promotions,
null introductions, etc.
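
Items 1 and 2 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed patch: FieldSchema, SchemaMismatch, reconcile and the user_specified flag are hypothetical names, and the numeric widening order (int, long, float, double) and the rule of promoting to the wider type are assumptions made for the example. Item 3 (complex constants) is not covered.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed widening order for numeric types; illustrative only.
NUMERIC_RANK = {"int": 0, "long": 1, "float": 2, "double": 3}


@dataclass
class FieldSchema:
    alias: Optional[str]
    type: str
    user_specified: bool = False  # hypothetical annotation (item 1)


class SchemaMismatch(Exception):
    """Raised when user and inferred schemas cannot be reconciled (item 2b)."""


def reconcile(user: List[FieldSchema],
              inferred: List[FieldSchema]) -> List[FieldSchema]:
    """Merge a user-specified schema with an inferred one."""
    if len(user) != len(inferred):
        raise SchemaMismatch("field counts differ")
    result = []
    for u, i in zip(user, inferred):
        if u.type == i.type:
            merged_type = u.type
        elif u.type in NUMERIC_RANK and i.type in NUMERIC_RANK:
            # Item 2a: promote to the wider of the two numeric types
            # (an assumed policy for this sketch).
            merged_type = max(u.type, i.type, key=NUMERIC_RANK.__getitem__)
        else:
            raise SchemaMismatch(f"{u.type} is incompatible with {i.type}")
        # Keep the user's alias and mark the field so that a later schema
        # reset can tell it apart from an inferred one.
        result.append(FieldSchema(u.alias, merged_type, user_specified=True))
    return result


user = [FieldSchema("name", "chararray"), FieldSchema("age", "int")]
inferred = [FieldSchema(None, "chararray"), FieldSchema(None, "long")]
merged = reconcile(user, inferred)
print([(f.alias, f.type) for f in merged])
```

With the annotation in place, a schema reset would be able to restore the fields marked as user-specified instead of discarding their aliases.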

> describe interfiers with name resolution
> ----------------------------------------
>
>                 Key: PIG-379
>                 URL: https://issues.apache.org/jira/browse/PIG-379
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>            Priority: Critical
>             Fix For: types_branch
>
>
> If I ran the following script:
> A = load 'studenttab10k' as (name: chararray, age: int, gpa: float);
> B = foreach A generate name, age;
> describe B;
> C = filter B by age > 30;
> describe C;
> D = group C by name;
> describe D;
> I get the error below. Also notice that the schema of C no longer has names:
> {name: chararray,age: integer}
> {chararray,integer}
> java.io.IOException: Invalid alias: name in {chararray,integer}
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:254)
>         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:422)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>         at org.apache.pig.Main.main(Main.java:302)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: name in {chararray,integer}
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5179)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5048)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:3357)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3254)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3208)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3117)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3043)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3009)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:2911)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.GroupItem(QueryParser.java:1548)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CogroupClause(QueryParser.java:1468)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:751)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:569)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:378)
>         at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:251)
> If I remove describe, I don't see any errors

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

