hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-616) Casts to complex types do not work as expected
Date Mon, 10 May 2010 20:26:30 GMT

     [ https://issues.apache.org/jira/browse/PIG-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Olga Natkovich resolved PIG-616.
--------------------------------

    Resolution: Fixed

Thanks, Gianmarco for testing

> Casts to complex types do not work as expected
> ----------------------------------------------
>
>                 Key: PIG-616
>                 URL: https://issues.apache.org/jira/browse/PIG-616
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Santhosh Srinivasan
>            Assignee: Alan Gates
>
> When we specify a (complex) type as a column in Pig, the TypeCastInserter inserts the
appropriate cast for the (complex) type. However, in the implementation of POCast.java, when
databyte arrays are converted to the (complex) types, we invoke the bytesToXXX method. 
> For complex types, especially tuples and bags, we do not enforce the typing information
specified by the user in the AS clause or with the explicit cast statement. The implementation
solely relies on bytesToXXX to figure out the right type.
> An example of a query that fails is given below. Wrt the query, the data is a single
column that is a bag of integers. The user specifies this bag to be a bag of chararray. This
conversion is allowed in pig but the implementation does not perform the actual cast. Instead
the bytesToBag is called on the stream. The resulting type is a bag of integers and not a
bag of chararray. In the subsequent statement the user (correctly) assumes that the conversion
has been performed but in reality it has not been done. At run time when a chararray based
operation is performed we see a ClassCastException.
> The notion of a schema has is absent in the physical operators. The schema/fieldSchema
in the logical layer has to be passed on to the physical layer. The schema can be used to
perform additional operations like casting, etc.
> {code}
> grunt> cat bag.data
> {(1)}
> grunt> a = load 'bag.data' as (b:{t:(c:chararray)});
> grunt> b = foreach a generate flatten(b);
> grunt> c = foreach b generate CONCAT('Hello ', $0);
> grunt> dump c;
> 2009-01-12 10:44:44,417 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
> 2009-01-12 10:45:09,439 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Map reduce job failed
> 2009-01-12 10:45:09,440 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Job failed!
> 2009-01-12 10:45:09,443 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher
- Error message from task (map) task_200812151518_9681_m_000000java.lang.ClassCastException:
java.lang.Integer cannot be cast to java.lang.String
>         at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:37)
>         at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:31)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:185)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:259)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:271)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:197)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:187)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:175)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
> ...
> 2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable
to open iterator for alias c
> 2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException:
Unable to open iterator for alias c
>         at org.apache.pig.PigServer.openIterator(PigServer.java:426)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:271)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:72)
>         at org.apache.pig.Main.main(Main.java:302)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         at org.apache.pig.PigServer.openIterator(PigServer.java:420)
>         ... 5 more
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message