pig-dev mailing list archives

From "Viraj Bhat (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-1065) In-determinate behaviour of Union when there are 2 non-matching schema's
Date Fri, 30 Oct 2009 03:05:59 GMT
In-determinate behaviour of Union when there are 2 non-matching schema's

                 Key: PIG-1065
                 URL: https://issues.apache.org/jira/browse/PIG-1065
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.6.0
            Reporter: Viraj Bhat
             Fix For: 0.6.0

I have a script that first takes a UNION of two relations with different schemas and then applies an ORDER BY to the result:

f1 = LOAD '1.txt' as (key:chararray, v:chararray);
f2 = LOAD '2.txt' as (key:chararray);
u0 = UNION f1, f2;
describe u0;
dump u0;

u1 = ORDER u0 BY $0;
dump u1;

When I run the script in MapReduce mode, I get the following result:
$ java -cp pig.jar:$HADOOP_HOME/conf org.apache.pig.Main broken.pig
Schema for u0 unknown.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias u1
        at org.apache.pig.PigServer.openIterator(PigServer.java:475)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
        at org.apache.pig.Main.main(Main.java:397)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:415)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:251)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)

When I run the same script in local mode, I get a different result, since local mode does not use any Hadoop classes:
$ java -cp pig.jar org.apache.pig.Main -x local broken.pig
Schema for u0 unknown

Here are some questions:
1) Why do we allow a UNION if the schemas do not match?
2) Should we not print an error message or warning, so that the user knows that this is not allowed or that it can produce unexpected results?
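
Until this is settled, one possible workaround is to give both inputs the same schema before the union. The following is only a minimal sketch, not verified against 0.6.0. It assumes that dropping column v from f1 is acceptable, and the alias f1k is a made-up name used for illustration:

```pig
-- Same inputs as in the script above.
f1 = LOAD '1.txt' as (key:chararray, v:chararray);
f2 = LOAD '2.txt' as (key:chararray);

-- Project f1 down to the single column that f2 also has
-- (assumption: losing v is acceptable; f1k is a hypothetical alias).
f1k = FOREACH f1 GENERATE key;

-- Both inputs now share the schema (key:chararray).
u0 = UNION f1k, f2;
describe u0;

u1 = ORDER u0 BY $0;
dump u1;
```

With matching input schemas, describe u0 should report a known schema instead of "Schema for u0 unknown", so the sort key would be typed as chararray rather than left as bytearray.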


