hadoop-pig-dev mailing list archives

From "Richard Ding (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-973) type resolution inconsistency
Date Wed, 09 Dec 2009 22:43:18 GMT

    [ https://issues.apache.org/jira/browse/PIG-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788366#action_12788366 ]

Richard Ding commented on PIG-973:
----------------------------------


The following script also produces an error:

{code}
A = load 'input' as (id:int, g:bag{t:tuple(u:int)});
B = foreach A generate id, SUM(g);
dump B;
{code}

The problem appears to stem from the patch for PIG-315, which only moved the exception from
compile time to runtime for the script below:

{code}
a = load 'studenttab10k' as (name:chararray, age:int, gpa:double);
b = foreach a generate (long)age as age, (int)gpa as gpa;
c = foreach b generate SUM(age), SUM(gpa);
dump c; 
{code}

Actually, one can argue that this is an invalid script -- the input type of the eval function
SUM is a bag of numbers, not a single number. So the error should be caught at compile time
(by the parser if possible).
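
For comparison, a minimal sketch of how the same computation could be written so that SUM
receives a bag. It reuses the relations a and b from the script above; the aliases grp and d
are hypothetical names introduced only for this illustration, and the sketch is not part of
any patch:

{code}
a = load 'studenttab10k' as (name:chararray, age:int, gpa:double);
b = foreach a generate (long)age as age, (int)gpa as gpa;
-- group all rows into a single bag so SUM has a bag to aggregate over
grp = group b all;
-- b.age and b.gpa are bag projections, which is the input type SUM expects
d = foreach grp generate SUM(b.age), SUM(b.gpa);
dump d;
{code}

Here the arguments to SUM are bags of numbers rather than scalar fields, which is the
distinction the type checker would need to enforce at compile time.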

> type resolution inconsistency
> -----------------------------
>
>                 Key: PIG-973
>                 URL: https://issues.apache.org/jira/browse/PIG-973
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Olga Natkovich
>            Assignee: Richard Ding
>
> This script works:
> A = load 'test' using PigStorage(':') as (name: chararray, age: int, gpa: float);
> B = group A by age;
> C = foreach B {
>    D = filter A by gpa > 2.5;
>    E = order A by name;
>    F = A.age;
>    describe F;
>    G = distinct F;
>    generate group, COUNT(D), MAX (E.name), MIN(G.$0);}
> dump C;
> This one produces an error:
> A = load 'test' using PigStorage(':') as (name: chararray, age: int, gpa: float);
> B = group A by age;
> C = foreach B {
>    D = filter A by gpa > 2.5;
>    E = order A by name;
>    F = A.age;
>    G = distinct F;
>    generate group, COUNT(D), MAX (E.name), MIN(G);}
> dump C;
> Notice the difference in how MIN is passed the data.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

