hadoop-pig-dev mailing list archives

From "Ankur (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-428) TypeCastInserter does not replace projects in inner plans correctly
Date Wed, 14 Jan 2009 05:28:59 GMT

     [ https://issues.apache.org/jira/browse/PIG-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankur updated PIG-428:
----------------------


I am still seeing this issue. I load my data with a custom loader, and one of the fields
the loader returns is of type Map.
When I retrieve a value from that map and group on it, I get the exception described in this
issue. Here is a snippet of my script.

raw =  LOAD '/mydata/*' USING MyLoader() ;
entry = FILTER raw BY (CUSTOMARGMAP#'keyOfInterest' is not null);
listing = FOREACH entry GENERATE CUSTOMARGMAP#'keyOfInterest' as keyGroup;
myGroup = GROUP listing BY (keyGroup);
unordered_results = FOREACH myGroup GENERATE group, COUNT(*);
results = ORDER unordered_results by $1 DESC;
STORE results INTO 'Results' USING PigStorage();

 



> TypeCastInserter does not replace projects in inner plans correctly
> -------------------------------------------------------------------
>
>                 Key: PIG-428
>                 URL: https://issues.apache.org/jira/browse/PIG-428
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>             Fix For: types_branch
>
>         Attachments: PIG-428.patch
>
>
> The TypeCastInserter tries to replace the input operator of each Project in the inner plans
> with the new foreach operator it adds. However, it should replace the input only of those
> Projects whose earlier input was the operator after which the new foreach was added.
> Here is a query which fails due to this:
> {code}
> a = load 'st10k' as (name:chararray,age:int, gpa:double);
> another = load 'st10k';
> c = foreach another generate $0, $1+ 10, $2 + 10;
> d = join a by $0, c by $0;
> dump d;
> {code}
> Here is the error:
> {noformat}
> 2008-09-11 23:34:28,169 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) tip_200809051428_0045_m_000000
> java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableBytesWritable
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:419)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:83)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:172)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:75)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
> {noformat}
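
For readers following the thread, here is a rough sketch of the replacement rule the
description asks for. It is not Pig's code and not the attached patch: the classes Op and
Project and the method rewire are hypothetical stand-ins invented for illustration, and the
assumption that the cast foreach is inserted after the load with the declared schema ("a" in
the failing query) is mine. The point is only that a Project is re-pointed when its current
input is the operator the new foreach was added after, and is left alone otherwise.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ProjectRewireSketch {

    /** Hypothetical stand-in for a logical operator; not Pig's real class. */
    static final class Op {
        final String name;
        Op(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for a Project sitting in an inner plan. */
    static final class Project {
        Op input;          // operator this project reads from
        final int column;  // projected column index
        Project(Op input, int column) { this.input = input; this.column = column; }
    }

    /**
     * Re-points only the projects whose current input is an operator that had
     * a new foreach inserted after it. insertedAfter maps each such operator
     * to the foreach that now follows it. Returns how many projects changed.
     */
    static int rewire(List<Project> innerPlanProjects, Map<Op, Op> insertedAfter) {
        int rewired = 0;
        for (Project p : innerPlanProjects) {
            Op replacement = insertedAfter.get(p.input);
            if (replacement != null) {   // this project read from the affected operator
                p.input = replacement;
                rewired++;
            }                            // otherwise the project is left untouched
        }
        return rewired;
    }

    public static void main(String[] args) {
        Op loadA = new Op("a");          // load with a declared schema (assumed to get the cast foreach)
        Op foreachC = new Op("c");       // foreach over the untyped load
        Op castForeach = new Op("cast-foreach-after-a");

        // Identity-based map: operators are matched by reference, not by name.
        Map<Op, Op> insertedAfter = new IdentityHashMap<>();
        insertedAfter.put(loadA, castForeach);

        // The two join-key projects of "d = join a by $0, c by $0".
        List<Project> joinKeys = new ArrayList<>();
        joinKeys.add(new Project(loadA, 0));
        joinKeys.add(new Project(foreachC, 0));

        System.out.println(rewire(joinKeys, insertedAfter) + " project(s) rewired");
        System.out.println(joinKeys.get(0).input.name);  // cast-foreach-after-a
        System.out.println(joinKeys.get(1).input.name);  // c
    }
}
```

In this toy model, re-pointing every Project unconditionally would also redirect the key on
"c" to the foreach added after "a", which is the kind of mix-up the description warns about.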

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

