hadoop-pig-dev mailing list archives

From "Pradeep Kamath (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-508) Query with a cogroup have one of its inputs coming from a group fails
Date Fri, 24 Oct 2008 19:23:46 GMT

     [ https://issues.apache.org/jira/browse/PIG-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-508:
-------------------------------

    Attachment: PIG-508.patch

Attached patch. Details on the fix:
POPackageAnnotator is a visitor which looks for each "POPackage" and annotates it with "keyInfo"
from each of the local rearranges that provide input to that POPackage. The keyInfo essentially
records which part of the "value" for a given input is present in the "key" and is hence
omitted from the "value". The visitor incorrectly assumed that once a local rearrange
corresponding to the package was found in the given MROper's map plan, the annotation was
complete. This breaks for the script in this issue: the POPackage has one of its local
rearranges in the map plan of the same MROper as the POPackage, and the other local rearrange
in the reduce plan of the predecessor MROper. The visitor was therefore changed to ensure that
the POPackage is annotated with information from *all* of its local rearranges.
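
The difference between the old and new behaviour can be sketched roughly as follows. This is
only an illustration and not the code in PIG-508.patch: the classes Rearrange and MROper and the
methods annotateMapPlanFirst, annotateFromAll and issueScenario are hypothetical stand-ins, not
Pig's actual classes or signatures, and the sketch assumes every local rearrange it sees feeds
the one package being annotated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PackageAnnotationSketch {

    /** Hypothetical stand-in for a local rearrange feeding one input of the package. */
    static final class Rearrange {
        final int inputIndex;              // which package input this rearrange feeds
        final List<Integer> columnsInKey;  // value columns that were moved into the key

        Rearrange(int inputIndex, List<Integer> columnsInKey) {
            this.inputIndex = inputIndex;
            this.columnsInKey = columnsInKey;
        }
    }

    /** Hypothetical stand-in for a map-reduce operator with its two plans. */
    static final class MROper {
        final List<Rearrange> mapPlan = new ArrayList<>();
        final List<Rearrange> reducePlan = new ArrayList<>();
        final List<MROper> predecessors = new ArrayList<>();
    }

    /** Old assumption: if the map plan yields any key info, the annotation is done. */
    static Map<Integer, List<Integer>> annotateMapPlanFirst(MROper op) {
        Map<Integer, List<Integer>> keyInfo = new HashMap<>();
        collect(op.mapPlan, keyInfo);
        if (keyInfo.isEmpty()) {
            for (MROper pred : op.predecessors) {
                collect(pred.reducePlan, keyInfo);
            }
        }
        return keyInfo;
    }

    /** Changed behaviour: gather key info from the map plan AND predecessor reduce plans. */
    static Map<Integer, List<Integer>> annotateFromAll(MROper op, int numInputs) {
        Map<Integer, List<Integer>> keyInfo = new HashMap<>();
        collect(op.mapPlan, keyInfo);
        for (MROper pred : op.predecessors) {
            collect(pred.reducePlan, keyInfo);
        }
        if (keyInfo.size() != numInputs) {
            throw new IllegalStateException("key info missing for some package input");
        }
        return keyInfo;
    }

    private static void collect(List<Rearrange> plan, Map<Integer, List<Integer>> keyInfo) {
        for (Rearrange r : plan) {
            keyInfo.put(r.inputIndex, r.columnsInKey);
        }
    }

    /** Shape of the failing script: input 0 (b) comes from the group job's reduce plan,
     *  input 1 (c) is rearranged in the cogroup job's own map plan. */
    static MROper issueScenario() {
        MROper groupJob = new MROper();
        groupJob.reducePlan.add(new Rearrange(0, List.of(0)));

        MROper cogroupJob = new MROper();
        cogroupJob.mapPlan.add(new Rearrange(1, List.of(0)));
        cogroupJob.predecessors.add(groupJob);
        return cogroupJob;
    }

    public static void main(String[] args) {
        MROper op = issueScenario();
        System.out.println(annotateMapPlanFirst(op).size()); // 1: input 0 was never annotated
        System.out.println(annotateFromAll(op, 2).size());   // 2: both inputs annotated
    }
}
```

With the old assumption the package ends up with key info for only one of its two inputs,
which matches the situation described above; collecting from both places covers every input.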

> Query with a cogroup have one of its inputs coming from a group fails
> ---------------------------------------------------------------------
>
>                 Key: PIG-508
>                 URL: https://issues.apache.org/jira/browse/PIG-508
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: types_branch
>
>         Attachments: PIG-508.patch
>
>
> Script which fails:
> {code}
> a = load '/user/pig/tests/data/singlefile/studenttab10k';
> b = group a by $0;
> c = load '/user/pig/tests/data/singlefile/studenttab10k';
> d = cogroup b by $0, c by $0;
> e = foreach d generate group, c.$1, SUM(c.$1), COUNT(c);
> dump e;
> {code}
> Error message produced:
> {noformat}
> 08/10/23 15:23:54 ERROR mapReduceLayer.MapReduceLauncher: Job failed! 
> 08/10/23 15:23:54 ERROR mapReduceLayer.Launcher: Error message from task (reduce) task_200810231521_0007_r_000000java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:218)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:208)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:134)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

