pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-302) Cross products of two flattens is incorrectly including records with null values
Date Thu, 10 Jul 2008 19:20:31 GMT
Cross products of two flattens is incorrectly including records with null values
--------------------------------------------------------------------------------

                 Key: PIG-302
                 URL: https://issues.apache.org/jira/browse/PIG-302
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
            Reporter: Alan Gates
            Assignee: Alan Gates


Given two data sets:

studenttab10
alex garcia,39,3.81
bob jones,40,2.77
zach johnson,23,4.00
tony mendleson,87,2.10
todd wellington,55,3.32
melany smith,19,3.98
jane wesley,62,1.98
irene chan,34,3.14
laverne shirley,58,2.43
marcia tently,32,3.48

and

alex garcia,39,republican,1.50
bob jones,40,democrat,1000.30
zach johnson,23,independent,0.00
tony mendleson,87,socialist,101012.92
todd wellington,55,green,99.89
melany smith,29,republican,88787.29
john wesley,62,democrat,0.89
bob smith,18,independent,0.99
johnny appleseed,234,green,99.95
barak obama,47,democrat,3.48

and the script:

a = load '/Users/gates/test/data/studenttab10' using PigStorage(',') as (name, age, gpa);
b = load '/Users/gates/test/data/votertab10' using PigStorage(',') as (name, age, registration,
contributions);
c = filter a by age < 40;
d = filter b by age < 40;
e = cogroup c by name, d by name;
f = foreach e generate flatten (c), flatten(d);
dump f;

The result is:

(NULL, bob smith, 18, independent, 0.99)
(alex garcia, 39, 3.81, alex garcia, 39, republican, 1.50)
(melany smith, 19, 3.98, melany smith, 29, republican, 88787.29)
(zach johnson, 23, 4.00, zach johnson, 23, independent, 0.00)

The first record should not be there.  Flatten is supposed to remove records without a match.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message