hadoop-pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-1146) Inconsistent column pruning in LOUnion
Date Wed, 23 Dec 2009 19:52:29 GMT

     [ https://issues.apache.org/jira/browse/PIG-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1146:
----------------------------

    Attachment: PIG-1146-2.patch

Addressed a couple of suggestions from Pradeep in ColumnPruner:
1. Cleaned up some of the code for LOUnion handling
2. Removed the code that merges the cached "pruned columns" structure for each logical operator
3. Simplified the logic that requires all relevant fields to be prunable before pruning

> Inconsistent column pruning in LOUnion
> --------------------------------------
>
>                 Key: PIG-1146
>                 URL: https://issues.apache.org/jira/browse/PIG-1146
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: PIG-1146-1.patch, PIG-1146-2.patch
>
>
> This happens when we union two relations in which one column comes from a loader and the
> matching column in the other relation comes from a constant, and that column gets pruned.
> We prune the column that comes from the loader but do not prune the constant, which leaves
> the union in an inconsistent state. Here is a script:
> {code}
> a = load '1.txt' as (a0, a1:chararray, a2);
> b = load '2.txt' as (b0, b2);
> c = foreach b generate b0, 'hello', b2;
> d = union a, c;
> e = foreach d generate $0, $2;
> dump e;
> {code}
> 1.txt: 
> {code}
> ulysses thompson        64      1.90
> katie carson    25      3.65
> {code}
> 2.txt:
> {code}
> luke king       0.73
> holly davidson  2.43
> {code}
> expected output:
> (ulysses thompson,1.90)
> (katie carson,3.65)
> (luke king,0.73)
> (holly davidson,2.43)
> real output:
> (ulysses thompson,)
> (katie carson,)
> (luke king,0.73)
> (holly davidson,2.43)
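
The mismatch above can be modelled outside Pig. The following is only a toy sketch in Python, not Pig's ColumnPruner code: rows are plain tuples, and prune() and project() are hypothetical helpers invented for illustration. It assumes the projection in e keeps its original positions when only the loader branch is pruned, which is one way to reproduce the reported output; the actual internal mechanism may differ.

```python
# Toy model of the PIG-1146 script; this is NOT Pig's ColumnPruner code.
# prune() and project() are hypothetical helpers used only for illustration.

def prune(rows, dropped):
    """Remove the columns whose positions are in `dropped` from every row."""
    return [tuple(v for i, v in enumerate(row) if i not in dropped) for row in rows]

def project(rows, positions):
    """Positional projection; a missing position yields None, like a null field."""
    return [tuple(row[p] if p < len(row) else None for p in positions) for row in rows]

a = [("ulysses thompson", "64", "1.90"), ("katie carson", "25", "3.65")]
b = [("luke king", "0.73"), ("holly davidson", "2.43")]
c = [(b0, "hello", b2) for (b0, b2) in b]  # c = foreach b generate b0, 'hello', b2

# e = foreach d generate $0, $2, so column 1 of the union is never used.
unused = {1}

# Inconsistent pruning (assumed model of the bug): column 1 is dropped from the
# loader branch only, while the projection still uses its original positions.
d_bad = prune(a, unused) + c
e_bad = project(d_bad, [0, 2])

# Consistent pruning: drop column 1 from both branches, then remap $2 to $1.
d_ok = prune(a, unused) + prune(c, unused)
e_ok = project(d_ok, [0, 1])
```

In this model e_bad has an empty second field for the rows from 1.txt, as in the real output, while e_ok matches the expected output because the column is pruned only when every input of the union can drop it.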

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

