hadoop-pig-dev mailing list archives

From "Viraj Bhat (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-1272) Column pruner causes wrong results
Date Tue, 02 Mar 2010 22:26:27 GMT
Column pruner causes wrong results

                 Key: PIG-1272
                 URL: https://issues.apache.org/jira/browse/PIG-1272
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.6.0
            Reporter: Viraj Bhat
             Fix For: 0.7.0

For a simple script, the column pruner optimization removes columns from the original
relation that are still needed later in the script, which produces wrong results.

The input file "kv" contains the following tab-separated data:
a       1
a       2
a       3
b       4
c       5
c       6
b       7
d       8

Running the following script in Pig 0.6 produces wrong results:

kv = load 'kv' as (k,v);
keys = foreach kv generate k;
keys = distinct keys;
keys = limit keys 2;
rejoin = join keys by k, kv by k;
dump rejoin;


Running this in Pig 0.5, without the column pruner, results in:

Disabling the "ColumnPruner" optimization gives the correct results.
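
As a rough cross-check of what the script is meant to compute, here is a minimal sketch of its semantics in plain Python. It is not Pig code and not part of the original report. The function name rejoin is hypothetical, the values are modelled as integers for readability, and taking the first two keys in sorted order is an assumption, since limit does not promise which two keys are kept.

```python
# Reference model of the script's intended semantics (plain Python, not Pig).
# Input rows copied from the "kv" file in the report; v is modelled as an int.
kv = [("a", 1), ("a", 2), ("a", 3), ("b", 4),
      ("c", 5), ("c", 6), ("b", 7), ("d", 8)]


def rejoin(kv, n=2):
    """Hypothetical helper mirroring the four statements of the script."""
    # keys = foreach kv generate k;  keys = distinct keys;
    distinct_keys = sorted({k for k, _ in kv})
    # keys = limit keys 2;
    # Assumption: we keep the first n keys in sorted order. Pig makes no
    # promise about which keys survive a limit without an explicit order.
    limited = distinct_keys[:n]
    # rejoin = join keys by k, kv by k;
    # Each output row carries keys::k, kv::k and kv::v, so three fields.
    return [(key, k, v) for key in limited for (k, v) in kv if key == k]


rows = rejoin(kv)
for row in rows:
    print(row)
```

Whichever two keys the limit happens to keep, every joined row in this model has three fields, because the join needs both columns of kv. A result in which the v column of kv is missing would therefore be inconsistent with the script.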


