hadoop-pig-dev mailing list archives

From "Pradeep Kamath (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-634) When POUnion is one of the roots of a map plan, POUnion.getNext() gives a null pointer exception
Date Mon, 26 Jan 2009 18:13:59 GMT

     [ https://issues.apache.org/jira/browse/PIG-634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Pradeep Kamath updated PIG-634:
-------------------------------

    Status: Patch Available  (was: Open)

Attached a patch that fixes POUnion.getNext() by adding the following condition:
{code}
public Result getNext(Tuple t) throws ExecException {

        if (nextReturnEOP) {
            nextReturnEOP = false ;
            return eopResult ;
        }

        // Case 1 : Normal connected plan
        if (!isInputAttached()) {
            
            if (inputs == null || inputs.size()==0) {
                // Neither does this Union have predecessors nor
                // was any input attached! This can happen when we have
                // a plan like below
                // POUnion
                // |
                // |--POLocalRearrange
                // |    |
                // |    |-POUnion (root 2)--> This union's getNext() can lead the code here
                // |
                // |--POLocalRearrange (root 1)
                
                // The inner POUnion above is a root in the plan which has 2 roots.
                // So these 2 roots would have input coming from different input
                // sources (dfs files). So certain maps would be working on input only
                // meant for "root 1" above and some maps would work on input
                // meant only for "root 2". In the former case, "root 2" would
                // neither get input attached to it nor does it have predecessors
                // which is the case which can lead us here.
                return eopResult;
            }
            ... rest of getNext
{code}

The new condition in getNext() is the check for inputs being null or empty (inputs.size() == 0).
It ensures that when POUnion is one of the roots of the map plan and no input is attached to it,
it returns EOP to its successor instead of throwing a null pointer exception.
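
To illustrate the guard outside of Pig, here is a minimal, self-contained sketch. It is not the patch and does not use Pig's classes: UnionRootSketch, SketchUnion, Source, Status and attachInput() are hypothetical stand-ins, and the real operator returns a Result object rather than an enum. It only shows the ordering of the checks: use an attached input if there is one, otherwise return EOP when there are no predecessors, and only then pull from predecessors.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class UnionRootSketch {

    // Hypothetical status codes; the real operator wraps these in a Result.
    enum Status { OK, EOP }

    // Hypothetical stand-in for a predecessor operator.
    interface Source {
        String next(); // returns null once exhausted
    }

    static class SketchUnion {
        private final List<Source> inputs; // predecessors; null or empty when the union is a root
        private String attached;           // tuple attached directly by the map, if any
        String lastValue;                  // last value produced, kept for inspection

        SketchUnion(List<Source> inputs) {
            this.inputs = inputs;
        }

        void attachInput(String tuple) {
            attached = tuple;
        }

        Status getNext() {
            // Root case: a tuple was attached directly, so pass it through.
            if (attached != null) {
                lastValue = attached;
                attached = null;
                return Status.OK;
            }
            // The guard: no attached input and no predecessors.
            // Without it, the loop below would dereference a null list.
            if (inputs == null || inputs.isEmpty()) {
                return Status.EOP;
            }
            // Connected case: pull from the first predecessor that still has data.
            for (Source s : inputs) {
                String v = s.next();
                if (v != null) {
                    lastValue = v;
                    return Status.OK;
                }
            }
            return Status.EOP;
        }
    }

    public static void main(String[] args) {
        // A union that is a plan root and receives nothing in this map.
        SketchUnion root = new SketchUnion(null);
        System.out.println(root.getNext()); // EOP

        // The same union in a map that does feed it.
        root.attachInput("t1");
        System.out.println(root.getNext()); // OK
        System.out.println(root.getNext()); // EOP

        // A union with one predecessor.
        Iterator<String> it = Arrays.asList("a", "b").iterator();
        Source src = () -> it.hasNext() ? it.next() : null;
        SketchUnion connected = new SketchUnion(Arrays.asList(src));
        System.out.println(connected.getNext() + " " + connected.lastValue); // OK a
    }
}
```

In this sketch, removing the null-or-empty check makes the first getNext() call in main fail with a NullPointerException when it iterates over inputs, which mirrors the failure described in the issue title.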

> When POUnion is one of the roots of a map plan, POUnion.getNext() gives a null pointer exception
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-634
>                 URL: https://issues.apache.org/jira/browse/PIG-634
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: types_branch
>
>         Attachments: PIG-634.patch
>
>
> POUnion.getNext() throws a null pointer exception in the following scenario (pasted from
> a code comment explaining the fix for this issue). If a script results in a plan like the
> one below, POUnion.getNext() currently throws a null pointer exception:
> {noformat}
>                 
>                 // POUnion
>                 // |
>                 // |--POLocalRearrange
>                 // |    |
>                 // |    |-POUnion (root 2)--> This union's getNext() can lead the code here
>                 // |
>                 // |--POLocalRearrange (root 1)
>                 
>                 // The inner POUnion above is a root in the plan which has 2 roots.
>                 // So these 2 roots would have input coming from different input
>                 // sources (dfs files). So certain maps would be working on input only
>                 // meant for "root 1" above and some maps would work on input
>                 // meant only for "root 2". In the former case, "root 2" would
>                 // neither get input attached to it nor does it have predecessors
> {noformat}
> A script which can cause a plan like above is:
> {code}
> a = load 'xyz'; 
> b = load 'abc'; 
> c = union a,b; 
> d = load 'def'; 
> e = cogroup c by $0 inner , d by $0 inner;
> dump e;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

