hadoop-pig-dev mailing list archives

From "Xu Zhang (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-146) More meaningful error message could be used when a non-existing column is accessed
Date Wed, 12 Mar 2008 02:20:46 GMT
More meaningful error message could be used when a non-existing column is accessed
----------------------------------------------------------------------------------

                 Key: PIG-146
                 URL: https://issues.apache.org/jira/browse/PIG-146
             Project: Pig
          Issue Type: Bug
            Reporter: Xu Zhang
            Priority: Minor


When accessing a non-existent column after producing the columns with a streaming command, I
got the following error, which is not very meaningful:
{noformat}
[main] ERROR org.apache.pig.tools.grunt.Grunt -
{noformat}

Here is the sample Pig script that I used.  The data file has only 3 tab-separated fields, so
the streaming command on line 3 generates tuples with 3 columns.  On line 4 the script tries
to access column $4, which does not exist, and thus the error above occurs.

{code}
A = load 'data';
B = foreach A generate $2, $1, $0;
C = stream B through `awk 'BEGIN {FS = "\t"; OFS = "\t"} {print $3, $2, $1}'`;
D = foreach C generate $4;
store D into 'results';
{code}

Here is what happens on my machine:

{code}
grunt> A = load 'data';
grunt> B = foreach A generate $2, $1, $0;
grunt> stream B through `awk 'BEGIN {FS = "\t"; OFS = "\t"} {print $3, $2, $1}'`;
grunt> D = foreach C generate $4;
2008-03-11 18:53:00,376 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
grunt>
{code}

A related note: Pig behaves differently if there is no streaming command in the Pig
script.  In that case, an IndexOutOfBoundsException is thrown at runtime.

{code}
grunt> A = load 'data';
grunt> B = foreach A generate $2, $1, $0;
grunt> D = foreach B generate $4;
grunt> store D into 'results';
2008-03-11 18:57:40,107 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- ----- MapReduce Job -----
2008-03-11 18:57:40,108 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Input: [data:org.apache.pig.builtin.PigStorage()]
2008-03-11 18:57:40,109 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map: [[*]->GENERATE {[PROJECT $2],[PROJECT $1],[PROJECT $0]}->GENERATE {[PROJECT $4]}]
2008-03-11 18:57:40,109 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Group: null
2008-03-11 18:57:40,110 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Combine: null
2008-03-11 18:57:40,110 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce: null
2008-03-11 18:57:40,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Output: results:org.apache.pig.builtin.PigStorage
2008-03-11 18:57:40,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Split: null
2008-03-11 18:57:40,112 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map parallelism: -1
2008-03-11 18:57:40,112 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce parallelism: -1
2008-03-11 18:57:42,391 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
- Pig progress = 0%
2008-03-11 18:57:59,466 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
- Error message from task (map) tip_200802211201_1494_m_000000 java.lang.IndexOutOfBoundsException:
Requested index 4 from tuple (2.93, 21, rachel ovid)
        at org.apache.pig.data.Tuple.getField(Tuple.java:153)
        at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
        at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
        at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
 java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
        at org.apache.pig.data.Tuple.getField(Tuple.java:153)
        at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
        at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
        at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
 java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
        at org.apache.pig.data.Tuple.getField(Tuple.java:153)
        at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
        at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
        at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
 java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
        at org.apache.pig.data.Tuple.getField(Tuple.java:153)
        at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
        at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
        at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

2008-03-11 18:57:59,469 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
- Error message from task (reduce) tip_200802211201_1494_r_000000
2008-03-11 18:57:59,470 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Unable to store
alias D
{code}
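
To illustrate the kind of message that would be more helpful in both cases, here is a minimal Java sketch of a bounds check that names the requested column and the number of columns actually present. The class name ColumnCheck and the method getField are hypothetical and used only for illustration; this is not Pig's actual Tuple API, and a plain java.util.List stands in for a tuple.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: not Pig's Tuple class or any real Pig API.
public class ColumnCheck {

    // Returns the field at the given zero-based position. If the position
    // is out of range, throws an exception whose message states which
    // column was requested and how many columns the tuple has.
    public static Object getField(List<?> tuple, int index) {
        if (index < 0 || index >= tuple.size()) {
            throw new IndexOutOfBoundsException(
                "Column $" + index + " does not exist: the tuple has "
                + tuple.size() + " column(s)");
        }
        return tuple.get(index);
    }

    public static void main(String[] args) {
        // Same three values as the tuple shown in the log above.
        List<Object> tuple = Arrays.<Object>asList(2.93, 21, "rachel ovid");

        // A valid positional reference.
        System.out.println(getField(tuple, 2));

        // An invalid positional reference, as in "generate $4".
        try {
            getField(tuple, 4);
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check of this kind could in principle run wherever the column count is known, so that the streaming and non-streaming cases report the same kind of message.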


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

