hadoop-pig-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1272) Column pruner causes wrong results
Date Wed, 03 Mar 2010 06:57:27 GMT

    [ https://issues.apache.org/jira/browse/PIG-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840496#action_12840496 ]

Hadoop QA commented on PIG-1272:
--------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437666/PIG-1272-1.patch
  against trunk revision 917827.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/220/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/220/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/220/console

This message is automatically generated.

> Column pruner causes wrong results
> ----------------------------------
>
>                 Key: PIG-1272
>                 URL: https://issues.apache.org/jira/browse/PIG-1272
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: PIG-1272-1.patch
>
>
> For a simple script, the column pruner optimization removes certain columns from the original relation, which produces wrong results.
> The input file "kv" contains the following tab-separated columns:
> {code}
> a       1
> a       2
> a       3
> b       4
> c       5
> c       6
> b       7
> d       8
> {code}
> Running the following script in Pig 0.6 produces the output listed after it:
> {code}
> kv = load 'kv' as (k,v);
> keys= foreach kv generate k;
> keys = distinct keys; 
> keys = limit keys 2;
> rejoin = join keys by k, kv by k;
> dump rejoin;
> {code}
> (a,a)
> (a,a)
> (a,a)
> (b,b)
> (b,b)
> Running the same script in Pig 0.5, which has no column pruner, produces:
> (a,a,1)
> (a,a,2)
> (a,a,3)
> (b,b,4)
> (b,b,7)
> When we disable the "ColumnPruner" optimization in Pig 0.6, the script gives the correct results.
> Viraj
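
For readers who want to check the two outputs by hand, here is a minimal Python sketch of what the script in the report is expected to compute. It is a plain-Python model of the distinct/limit/join steps and does not exercise Pig or its optimizer. The function names `rejoin` and `rejoin_pruned` are made up for this illustration, and taking the first two distinct keys in first-seen order is an assumption (Pig's `limit` without an `order by` does not promise which rows are kept), chosen because it matches the keys shown in the report.

```python
# Rows of the tab-separated input file "kv" from the report.
KV = [("a", 1), ("a", 2), ("a", 3), ("b", 4),
      ("c", 5), ("c", 6), ("b", 7), ("d", 8)]


def rejoin(kv, limit=2):
    """Model of: project k, distinct, limit, then join back to kv by k."""
    # distinct + limit; first-seen order is an assumption of this sketch
    keys = list(dict.fromkeys(k for k, _ in kv))[:limit]
    # inner join "keys by k, kv by k": each row is (keys::k, kv::k, kv::v)
    return [(key, k, v) for key in keys for k, v in kv if k == key]


def rejoin_pruned(kv, limit=2):
    """Same join with column v dropped from kv, as in the wrong output."""
    return [(key, k) for key, k, _ in rejoin(kv, limit)]


if __name__ == "__main__":
    print("expected (matches the Pig 0.5 output):")
    for row in rejoin(KV):
        print(row)
    print("with v pruned (matches the Pig 0.6 output):")
    for row in rejoin_pruned(KV):
        print(row)
```

The only difference between the two functions is whether `v` survives into the join result, which is the symptom the report describes: the join still needs both columns of `kv`, even though the `keys` branch of the script uses only `k`.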

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

