pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-435) wrong columns produced if incomplete definition provided during load
Date Wed, 02 Mar 2011 21:17:37 GMT

     [ https://issues.apache.org/jira/browse/PIG-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-435.

    Resolution: Duplicate

This issue will be resolved as part of the fix for https://issues.apache.org/jira/browse/PIG-1188

> wrong columns produced if incomplete definition provided during load
> --------------------------------------------------------------------
>                 Key: PIG-435
>                 URL: https://issues.apache.org/jira/browse/PIG-435
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: Olga Natkovich
>            Assignee: Daniel Dai
>            Priority: Minor
>             Fix For: 0.9.0
> Script:
> A = load 'studenttab10k' as (name); -- note that data has more than 1 column
> B = load 'votertab10k' as (name, age, reg, contrib);
> D = COGROUP A by name, B by name;  
> E = foreach D generate flatten(A), flatten(B); 
> F = foreach E generate reg, contrib;
> dump F;
> The dump produces the wrong columns. Even though only one column is declared for A,
> all columns of A are actually loaded. So anywhere the script explicitly or implicitly
> uses A.*, as flatten(A) does here, the results are wrong.
> The long-term solution is to push projections into the load. In the shorter term, the
> proposal is to detect whether the script uses A.* and insert a projection after the load.
> Note that this is not needed when types are declared, because a casting foreach will
> already be present.
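
The mismatch described above can be illustrated with a small Python model. This is only a sketch of the symptom and of the proposed post-load projection, not Pig's actual implementation: the helper names (load, project, join_flatten, select) and the sample rows are hypothetical, and the assumption that the student data has three columns is made up for illustration.

```python
# Toy model of the PIG-435 symptom. Not Pig's real code; all names are hypothetical.

def load(rows, declared):
    """Model a load that keeps every physical column, whatever was declared."""
    return {"schema": list(declared), "rows": [tuple(r) for r in rows]}

def project(rel):
    """Model the proposed fix: trim each row to the declared columns."""
    n = len(rel["schema"])
    return {"schema": rel["schema"], "rows": [r[:n] for r in rel["rows"]]}

def join_flatten(a, b):
    """Model COGROUP on column 0 followed by flatten(A), flatten(B)."""
    schema = a["schema"] + b["schema"]
    rows = [ra + rb for ra in a["rows"] for rb in b["rows"] if ra[0] == rb[0]]
    return {"schema": schema, "rows": rows}

def select(rel, names):
    """Resolve column names to positions through the declared schema."""
    idx = [rel["schema"].index(n) for n in names]
    return [tuple(r[i] for i in idx) for r in rel["rows"]]

# Assumed sample data: the student file has three columns, but only one is declared.
students = [("alice", 20, 3.5)]
voters = [("alice", 20, "democrat", 100.0)]

A = load(students, ["A::name"])
B = load(voters, ["B::name", "B::age", "B::reg", "B::contrib"])

# Without a projection, A's undeclared columns shift every position that follows.
wrong = select(join_flatten(A, B), ["B::reg", "B::contrib"])

# With a projection placed right after the load, schema and data line up again.
fixed = select(join_flatten(project(A), B), ["B::reg", "B::contrib"])

print(wrong)
print(fixed)
```

In this model the schema says B::reg is the fourth column, but the flattened row still carries A's two undeclared columns, so the lookup lands on B's name and age instead. Trimming A to its declared width right after the load restores the alignment.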

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

