pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4057) Group All followed by CROSS with default parallelism produces wrong results
Date Fri, 25 Jul 2014 23:34:38 GMT

     [ https://issues.apache.org/jira/browse/PIG-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4057:
----------------------------

    Attachment: PIG-4057-5.patch

Another patch addressing Rohini's review comments.

> Group All followed by CROSS with default parallelism produces wrong results
> ---------------------------------------------------------------------------
>
>                 Key: PIG-4057
>                 URL: https://issues.apache.org/jira/browse/PIG-4057
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4057-1.patch, PIG-4057-2.patch, PIG-4057-3.patch, PIG-4057-4.patch, PIG-4057-5.patch
>
>
> SET default_parallel 199;
> ......
> by_size = ...
> uniq_vals = .....
> grpd = group uniq_vals all;
> all_vals = FOREACH grpd GENERATE uniq_vals;
> cross_result = CROSS by_size, all_vals;
> store cross_result into '/tmp/roh/cross/out/recipient_asns';
>
> Job1: grpd, all_vals, cross_result (The plan runs the GFCross function for
> all_vals here and assumes a cross parallelism of 1, taken from the current job,
> even though it should use the default parallelism of 199 that applies to Job2.
> The parallelism of Job1 is 1 because of the group all.)
> Job2: cross_result (the actual CROSS of by_size and all_vals)
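
To illustrate why a parallelism mismatch between the two sides of a CROSS can lose rows, here is a minimal Python sketch of a grid-partitioned cross. It is a simplified model written for this note, not Pig's GFCross code: the names grid_size, tag and cross are hypothetical, the grid-size formula (ceiling of the square root of the parallelism for two inputs) is an assumption about how such a scheme sizes its grid, and the slot choice uses i % n as a deterministic stand-in for a random pick.

```python
import math
from collections import defaultdict


def grid_size(parallelism, num_inputs=2):
    # Assumed sizing rule: cells per dimension = ceil(parallelism ** (1 / num_inputs)).
    return max(1, math.ceil(parallelism ** (1.0 / num_inputs)))


def tag(tuples, dim, n):
    """Give each tuple one slot in its own dimension and replicate it
    across every slot of the other dimension of an n x n grid."""
    out = []
    for i, t in enumerate(tuples):
        slot = i % n  # deterministic stand-in for a random slot
        for other in range(n):
            key = (slot, other) if dim == 0 else (other, slot)
            out.append((key, t))
    return out


def cross(left, right, left_parallelism, right_parallelism):
    """Join the tagged tuples on their grid cell. The result is a full
    cross product only when both sides were tagged with the same grid size."""
    cells = defaultdict(lambda: ([], []))
    for key, t in tag(left, 0, grid_size(left_parallelism)):
        cells[key][0].append(t)
    for key, t in tag(right, 1, grid_size(right_parallelism)):
        cells[key][1].append(t)
    return [(l, r) for ls, rs in cells.values() for l in ls for r in rs]


if __name__ == "__main__":
    by_size = list(range(30))   # placeholder rows
    all_vals = ["bag"]          # single row, as produced by a group all
    print(len(cross(by_size, all_vals, 199, 199)))  # both sides agree on the grid
    print(len(cross(by_size, all_vals, 199, 1)))    # right side tagged with parallelism 1
```

In this model, tagging all_vals with parallelism 1 places it in a single grid cell, so only the by_size rows that happen to land in that cell's row find a match; the rest silently drop out of the result.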



--
This message was sent by Atlassian JIRA
(v6.2#6252)
