pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-3826) Outer join with PushDownForEachFlatten generates wrong result
Date Fri, 21 Mar 2014 03:59:44 GMT

     [ https://issues.apache.org/jira/browse/PIG-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3826:
----------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Patch committed to both trunk and the 0.12 branch. Thanks to Rohini for the review!

> Outer join with PushDownForEachFlatten generates wrong result
> -------------------------------------------------------------
>
>                 Key: PIG-3826
>                 URL: https://issues.apache.org/jira/browse/PIG-3826
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.12.1, 0.13.0
>
>         Attachments: PIG-3826-1.patch
>
>
> The following script generates a wrong result:
> A = load 'A.txt' using PigStorage(',') as (id:chararray, value:double);
> B = load 'B.txt' using PigStorage(',') as (id:chararray, name:chararray);
> t1 = group A by id;
> t2 = foreach t1 { r1 = filter $1 by (value>1); r2 = limit r1 1; generate group as id, FLATTEN(r2.value) as value; };
> t3 = join B by id LEFT OUTER, t2 by id;
> dump t3;
> A.txt:
> 1,1.5
> 2,0
> 3,-2.0
> 4,8.9
> B.txt:
> 1,Ofer
> 2,Jordan
> 3,Noa
> 4,Daniel
> Expected output:
> (1,Ofer,1,1.5)
> (2,Jordan,,)
> (3,Noa,,)
> (4,Daniel,4,8.9)
> But we get:
> (1,Ofer,1,1.5)
> (4,Daniel,4,8.9)
> With the option "-t PushDownForEachFlatten", which turns that optimizer rule off, the issue goes away.
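
The expected output above can be cross-checked outside Pig. The following is a minimal plain-Python sketch of the semantics the script is meant to have, not Pig code and not the committed patch. The helper names build_t2 and left_outer_join are hypothetical, and the data is copied from A.txt and B.txt. It assumes that FLATTEN of an empty bag emits no row for that group, so ids 2 and 3 are absent from t2 and must still appear in the left outer join padded with nulls.

```python
# Input data copied from A.txt and B.txt in the report.
A = [("1", 1.5), ("2", 0.0), ("3", -2.0), ("4", 8.9)]
B = [("1", "Ofer"), ("2", "Jordan"), ("3", "Noa"), ("4", "Daniel")]


def build_t2(rows):
    """Mimic: group by id, filter value > 1, limit 1, FLATTEN(r2.value)."""
    groups = {}
    for id_, value in rows:
        groups.setdefault(id_, []).append(value)
    t2 = []
    for id_, values in groups.items():
        r2 = [v for v in values if v > 1][:1]  # filter, then limit 1
        # Flattening an empty bag yields no output row for this group.
        for v in r2:
            t2.append((id_, v))
    return t2


def left_outer_join(left, right):
    """Join on the first field, keeping every left row (null-padded if unmatched)."""
    index = {}
    for row in right:
        index.setdefault(row[0], []).append(row)
    out = []
    for lrow in left:
        matches = index.get(lrow[0])
        if matches:
            for rrow in matches:
                out.append(lrow + rrow)
        else:
            out.append(lrow + (None, None))
    return out


t3 = left_outer_join(B, build_t2(A))
for row in t3:
    print(row)
```

Under these assumptions the sketch keeps all four rows of B, which matches the expected output rather than the two-row result reported in the bug.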



--
This message was sent by Atlassian JIRA
(v6.2#6252)
