hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-302) Cross products of two flattens is incorrectly including records with null values
Date Thu, 10 Jul 2008 21:22:32 GMT

     [ https://issues.apache.org/jira/browse/PIG-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Alan Gates updated PIG-302:
---------------------------

    Attachment: cogroup.patch

> Cross products of two flattens is incorrectly including records with null values
> --------------------------------------------------------------------------------
>
>                 Key: PIG-302
>                 URL: https://issues.apache.org/jira/browse/PIG-302
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>             Fix For: types_branch
>
>         Attachments: cogroup.patch
>
>
> Given two data sets:
> studenttab10
> alex garcia,39,3.81
> bob jones,40,2.77
> zach johnson,23,4.00
> tony mendleson,87,2.10
> todd wellington,55,3.32
> melany smith,19,3.98
> jane wesley,62,1.98
> irene chan,34,3.14
> laverne shirley,58,2.43
> marcia tently,32,3.48
> and
> alex garcia,39,republican,1.50
> bob jones,40,democrat,1000.30
> zach johnson,23,independent,0.00
> tony mendleson,87,socialist,101012.92
> todd wellington,55,green,99.89
> melany smith,29,republican,88787.29
> john wesley,62,democrat,0.89
> bob smith,18,independent,0.99
> johnny appleseed,234,green,99.95
> barak obama,47,democrat,3.48
> and the script:
> a = load '/Users/gates/test/data/studenttab10' using PigStorage(',') as (name, age, gpa);
> b = load '/Users/gates/test/data/votertab10' using PigStorage(',') as (name, age, registration,
contributions);
> c = filter a by age < 40;
> d = filter b by age < 40;
> e = cogroup c by name, d by name;
> f = foreach e generate flatten (c), flatten(d);
> dump f;
> The result is:
> (NULL, bob smith, 18, independent, 0.99)
> (alex garcia, 39, 3.81, alex garcia, 39, republican, 1.50)
> (melany smith, 19, 3.98, melany smith, 29, republican, 88787.29)
> (zach johnson, 23, 4.00, zach johnson, 23, independent, 0.00)
> The first record should not be there.  Flatten is supposed to remove records without
a match.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message