pig-dev mailing list archives

From "John Meagher (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-1767) Intermediate data lost in local mode
Date Tue, 14 Dec 2010 17:05:02 GMT
Intermediate data lost in local mode
------------------------------------

                 Key: PIG-1767
                 URL: https://issues.apache.org/jira/browse/PIG-1767
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.7.0, 0.8.0
         Environment: Windows 7/64bit running in Cygwin
            Reporter: John Meagher


The script below fails in local mode because one of the intermediate data files is not available
to one of the follow-on steps. This fails under both 0.7.0 and the 0.8.0-rc. The same script works
fine on 0.7.0 when run against HDFS.


cat test.txt
a,1,2
b,1,3
c,1,4
d,1,5
e,1,6
f,1,7
g,1,8
h,1,9
i,1,10
j,1,11
k,1,12
l,1,13
a,1,100
y,10,20

profileTimeInfo  = LOAD 'test.txt' USING PigStorage(',') AS (id:chararray, created:long, timestamp:long);
timesById = GROUP profileTimeInfo BY id;
ageById = FOREACH timesById GENERATE group, (MAX(profileTimeInfo.timestamp) - MIN(profileTimeInfo.created)) AS age;
sortedAges = ORDER ageById BY age DESC;
topAges = LIMIT sortedAges 10;

DUMP timesById;  -- Succeeds
DUMP ageById;  -- Succeeds
DUMP sortedAges;  -- Fails, see exception below
DUMP topAges;  -- Fails, see exception below
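
For reference, the values that sortedAges and topAges should contain can be worked out from test.txt without Pig. The following is a minimal Python sketch, not part of the original report: the names TEST_TXT, ages_by_id and top_ages are hypothetical, and the tie between ids j and y (both age 10) is broken by id only to keep the sketch deterministic. Pig's ORDER does not promise that order for ties.

```python
from collections import defaultdict

# Contents of test.txt as given in the report: id, created, timestamp.
TEST_TXT = """a,1,2
b,1,3
c,1,4
d,1,5
e,1,6
f,1,7
g,1,8
h,1,9
i,1,10
j,1,11
k,1,12
l,1,13
a,1,100
y,10,20"""


def ages_by_id(text):
    """Mirror GROUP ... BY id, then MAX(timestamp) - MIN(created) per group."""
    created = defaultdict(list)
    timestamps = defaultdict(list)
    for line in text.splitlines():
        ident, created_at, timestamp = line.split(",")
        created[ident].append(int(created_at))
        timestamps[ident].append(int(timestamp))
    return {i: max(timestamps[i]) - min(created[i]) for i in timestamps}


def top_ages(text, limit=10):
    """Mirror ORDER ... BY age DESC followed by LIMIT.

    Ties are broken by id here; that is an assumption of this sketch only.
    """
    ages = ages_by_id(text)
    return sorted(ages.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


if __name__ == "__main__":
    for ident, age in top_ages(TEST_TXT):
        print(f"({ident},{age})")
```

Under these assumptions the first tuple is (a,99) and the tenth is (e,5), so a working ORDER/LIMIT in local mode should return ten of the thirteen groups.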

Exception dumped in grunt:

2010-12-14 11:59:02,248 [Thread-72] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2010-12-14 11:59:02,251 [Thread-72] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/C:/jmeagher/devel/sample_data/pigsample_21114123_1292345941552
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:139)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:527)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/C:/jmeagher/devel/sample_data/pigsample_21114123_1292345941552
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
        at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:153)
        at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:115)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:112)
        ... 6 more



Exception from the pig log file:

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias sortedAges

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias sortedAges
        at org.apache.pig.PigServer.openIterator(PigServer.java:754)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
        at org.apache.pig.Main.run(Main.java:465)
        at org.apache.pig.Main.main(Main.java:107)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
        at org.apache.pig.PigServer.openIterator(PigServer.java:744)
        ... 7 more
================================================================================
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias topAges

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias topAges
        at org.apache.pig.PigServer.openIterator(PigServer.java:754)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
        at org.apache.pig.Main.run(Main.java:465)
        at org.apache.pig.Main.main(Main.java:107)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
        at org.apache.pig.PigServer.openIterator(PigServer.java:744)
        ... 7 more
================================================================================

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

