pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4737) Check and fix clone implementation for all classes extending PhysicalOperator
Date Tue, 12 Jan 2016 18:21:39 GMT

     [ https://issues.apache.org/jira/browse/PIG-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Rohini Palaniswamy updated PIG-4737:
    Attachment: PIG-4737-1.patch

> Check and fix clone implementation for all classes extending PhysicalOperator
> -----------------------------------------------------------------------------
>                 Key: PIG-4737
>                 URL: https://issues.apache.org/jira/browse/PIG-4737
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>         Attachments: PIG-4737-1.patch
>     PhysicalOperator.clone() eventually calls Object.clone() which only does a shallow
copy (javadoc wrongly says deep copy) and this causes issues with UnionOptimizer in Tez. Most
of the clone is already fixed due to issues found earlier, but recently ran into an issue
with POStream where after clone same reference was retained to binaryOutputQueue and binaryInputQueue
and caused the script to hang. 
> Mostly cloned operators in Union go to different tez vertex plans and the issue would
not have occurred, but in the particular case due to replicated join and with the combination
of multi-query and union optimization, both the cloned plans of union ended up in the same
vertex(one that loads C). That single vertex will handle both the replicated joins and streaming
in two sub-plans of split and store the final result in g.
> {code}
> A = LOAD 'a';
> B = LOAD 'b';
> C = LOAD 'c';
> D = JOIN C by $0, A by $0 using 'replicated';
> E = JOIN C by $0, B by $0 using 'replicated';
> F = UNION D, E;
> G = STREAM F through ....
> STORE G into 'g';
> {code}
> It is good to go through all classes extending PhysicalOperator and check if it deep
clones objects that are not primitive types.

This message was sent by Atlassian JIRA

View raw message