hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Assigned: (PIG-1281) Detect org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.Tuple type of errors at Compile Type during creation of logical plan
Date Mon, 13 Sep 2010 22:51:37 GMT

     [ https://issues.apache.org/jira/browse/PIG-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates reassigned PIG-1281:
-------------------------------

    Assignee: Alan Gates

> Detect org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.Tuple type of errors at Compile Type during creation of logical plan
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1281
>                 URL: https://issues.apache.org/jira/browse/PIG-1281
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Alan Gates
>             Fix For: 0.9.0
>
>
> This is more of an enhancement request: simple errors like the one below could be detected at compile time, during creation of the logical plan, rather than at the backend.
> I created a script containing an error that is detected in the backend as a cast error, when in fact it could be detected in the front end (group is a single element, so the group.$0 projection will not work).
> {code}
> inputdata = LOAD '/user/viraj/mymapdata' AS (col1, col2, col3, col4);
> projdata = FILTER inputdata BY (col1 is not null);
> groupprojdata = GROUP projdata BY col1;
> cleandata = FOREACH groupprojdata {
>                      bagproj = projdata.col1;
>                      dist_bags = DISTINCT bagproj;
>                      GENERATE group.$0 as newcol1, COUNT(dist_bags) as newcol2;
>                       };
> cleandata1 = GROUP cleandata by newcol2;
> cleandata2 = FOREACH cleandata1 { GENERATE group.$0 as finalcol1, COUNT(cleandata.newcol1) as finalcol2; };
> ordereddata = ORDER cleandata2 by finalcol2;
> store ordereddata into 'finalresult' using PigStorage();
> {code}
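
The front-end check requested above could be sketched roughly as follows. This is
a minimal Python illustration, not Pig's actual implementation: FieldSchema,
FrontEndError, group_field and project_position are hypothetical names invented
for the sketch. It relies on two Pig behaviours: a field loaded without a
declared type is treated as a bytearray, and GROUP BY on a single key gives
`group` the type of that key, while several keys give a tuple. Projecting `$0`
out of a scalar `group` can therefore be rejected while the plan is being built.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldSchema:
    """Hypothetical stand-in for one field of a logical-plan schema."""
    alias: str
    type_name: str  # e.g. "bytearray", "tuple", "bag"
    inner: List["FieldSchema"] = field(default_factory=list)


class FrontEndError(Exception):
    """Raised while the plan is built, before any job is launched."""


def group_field(keys: List[FieldSchema]) -> FieldSchema:
    """Schema of `group`: scalar for one key, tuple for several keys."""
    if len(keys) == 1:
        return FieldSchema("group", keys[0].type_name, list(keys[0].inner))
    return FieldSchema("group", "tuple", list(keys))


def project_position(parent: FieldSchema, position: int) -> FieldSchema:
    """Validate `parent.$position` against the declared schema."""
    if parent.type_name not in ("tuple", "bag"):
        raise FrontEndError(
            f"cannot project ${position} from '{parent.alias}' "
            f"of type {parent.type_name}"
        )
    if position >= len(parent.inner):
        raise FrontEndError(
            f"'{parent.alias}' has no field at position {position}"
        )
    return parent.inner[position]


# col1 was loaded without a type, so it is a bytearray.
col1 = FieldSchema("col1", "bytearray")
group = group_field([col1])

try:
    project_position(group, 0)  # mirrors `group.$0` in the script
except FrontEndError as err:
    print("front-end error:", err)
```

With a check of this kind the script would fail at plan creation instead of
surfacing as a DataByteArray-to-Tuple cast error in the backend. Writing
`GENERATE group as newcol1, ...` instead of `group.$0` avoids the problem.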

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

