hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-454) group by followed by group ALL causes error in reduce
Date Thu, 25 Sep 2008 00:19:44 GMT

     [ https://issues.apache.org/jira/browse/PIG-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-454:
---------------------------

    Attachment: PIG-454.patch

CombinerOptimizer is a visitor that walks the entire plan of MapReduceOpers. It was not resetting
its state as it visited each operator, so when more than one operator in the plan could use the
combiner, it could set the wrong key in the combiner.
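
To illustrate the kind of bug described above, here is a minimal sketch of a visitor that keeps
per-operator state in a field. It is not the code in PIG-454.patch: SketchOper,
SketchCombinerVisitor, and the key type labels "typeA" and "typeB" are all hypothetical names
made up for this example, and the real CombinerOptimizer may track its state quite differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one MapReduce operator in a plan (not Pig's real class).
final class SketchOper {
    final String groupKeyType;  // illustrative label for the key type this operator groups on
    String combinerKeyType;     // key type the visitor chose for this operator's combiner

    SketchOper(String groupKeyType) {
        this.groupKeyType = groupKeyType;
    }
}

// Hypothetical visitor that keeps per-operator state in a field.
final class SketchCombinerVisitor {
    private final boolean resetPerOperator;
    private String keyType;  // meant to describe only the operator being visited

    SketchCombinerVisitor(boolean resetPerOperator) {
        this.resetPerOperator = resetPerOperator;
    }

    void visit(SketchOper op) {
        if (resetPerOperator) {
            keyType = null;  // forget whatever the previous operator left behind
        }
        if (keyType == null) {
            keyType = op.groupKeyType;  // record the key type only if none is held yet
        }
        op.combinerKeyType = keyType;
    }

    // Walks a two-operator plan and returns the key type chosen for each combiner.
    static List<String> run(boolean resetPerOperator) {
        List<SketchOper> plan = new ArrayList<>();
        plan.add(new SketchOper("typeA"));
        plan.add(new SketchOper("typeB"));
        SketchCombinerVisitor visitor = new SketchCombinerVisitor(resetPerOperator);
        List<String> chosen = new ArrayList<>();
        for (SketchOper op : plan) {
            visitor.visit(op);
            chosen.add(op.combinerKeyType);
        }
        return chosen;
    }

    public static void main(String[] args) {
        System.out.println(run(false));  // prints [typeA, typeA]: second operator inherits stale state
        System.out.println(run(true));   // prints [typeA, typeB]: each operator gets its own key type
    }
}
```

Without the reset, the second operator's combiner is given the first operator's key type, which
is the same shape of mismatch as the "wrong key class" error quoted below.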

> group by followed by group ALL causes error in reduce
> -----------------------------------------------------
>
>                 Key: PIG-454
>                 URL: https://issues.apache.org/jira/browse/PIG-454
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Alan Gates
>             Fix For: types_branch
>
>         Attachments: PIG-454.patch
>
>
> Script:
> {code}
> a = load 'st10k' as (name, age, gpa);
> b = group a by name;
> c = foreach b generate flatten(group), COUNT(a) as cnt;
> d = group c all;
> e = foreach d generate AVG(c.cnt);
> dump e;
> {code}
> Error:
> {noformat}
> 2008-09-23 17:58:12,002 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed!
> 2008-09-23 17:58:12,004 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) tip_200809051428_0117_m_000000java.io.IOException: wrong key class: org.apache.pig.impl.io.NullableTuple is not class org.apache.pig.impl.io.NullableText
>         at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:995)
>         at org.apache.hadoop.mapred.MapTask$CombineOutputCollector.collect(MapTask.java:1079)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:155)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:56)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:872)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:779)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:691)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:220)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
> ...
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.