hadoop-pig-dev mailing list archives

From "Richard Ding (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-1525) Incorrect data generated by diff of SUM
Date Thu, 29 Jul 2010 22:00:21 GMT
Incorrect data generated by diff of SUM

                 Key: PIG-1525
                 URL: https://issues.apache.org/jira/browse/PIG-1525
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.7.0
            Reporter: Richard Ding
            Assignee: Richard Ding
             Fix For: 0.8.0

Given the data:


id9     0


id8     1
id9     1

Pig script

A = LOAD 'input1' AS (id:chararray, val:long);
B = LOAD 'input2' AS (id:chararray, val:long);
C = COGROUP A BY id, B BY id;
D = FOREACH C GENERATE group, SUM(B.val), SUM(A.val), (SUM(A.val) - SUM(B.val));
dump D;

generates incorrect data:


The workaround is to replace the FOREACH statement with

D = FOREACH C GENERATE group, SUM(B.val) as b, SUM(A.val) as a;
E = FOREACH D GENERATE $0, b, a, (a-b);
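
As a cross-check, the values that both forms of the script should agree on can be sketched outside Pig. The Python below is only a simplified model, not Pig code: it assumes the first data row above belongs to 'input1' and the remaining two rows to 'input2' (the report does not label them), and it assumes Pig's usual null handling, where SUM over an empty bag is null and arithmetic involving null yields null. The helper names (pig_sum, diff, expected_rows) are hypothetical.

```python
from collections import defaultdict


def pig_sum(values):
    """Model of Pig's SUM: nulls are ignored, an empty bag yields null (None)."""
    vals = [v for v in values if v is not None]
    return sum(vals) if vals else None


def diff(a, b):
    """Model of Pig subtraction: the result is null if either operand is null."""
    return None if a is None or b is None else a - b


def expected_rows(input1, input2):
    """Mimic COGROUP A BY id, B BY id followed by the FOREACH projection."""
    bags = defaultdict(lambda: ([], []))  # id -> (bag of A.val, bag of B.val)
    for key, val in input1:
        bags[key][0].append(val)
    for key, val in input2:
        bags[key][1].append(val)
    rows = []
    for key in sorted(bags):
        a_vals, b_vals = bags[key]
        a, b = pig_sum(a_vals), pig_sum(b_vals)
        # Column order follows the script: group, SUM(B.val), SUM(A.val), a - b
        rows.append((key, b, a, diff(a, b)))
    return rows


# Assumed split of the sample rows between the two inputs.
input1 = [("id9", 0)]
input2 = [("id8", 1), ("id9", 1)]

for row in expected_rows(input1, input2):
    print(row)
```

Under these assumptions the last column for id9 is -1, and id8, which has no rows in A, gets null for both SUM(A.val) and the difference.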
