crunch-dev mailing list archives

From Dave Beech <d...@paraliatech.com>
Subject Misleading error message
Date Fri, 28 Jun 2013 15:07:44 GMT
Hi all,

Please take a look at the following pipeline:

read(From.textFile(args[0])).write(To.textFile(args[1] + "-text"));
run();
read(From.textFile(args[0])).write(To.sequenceFile(args[1] + "-seq"));
run();
read(From.textFile(args[0])).write(To.avroFile(args[1] + "-avro"));
done();

The first two jobs are fine, and produce text and sequence file output
respectively. The text-to-Avro conversion fails. This is no great surprise,
knowing a little about the internals of Crunch, but when put alongside the
other examples it feels like it should work.

Even if it can't work - no big deal, it's just a toy example. The main
problem for me was the error message:

13/06/28 14:11:40 INFO jobcontrol.CrunchControlledJob:
org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:872)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)

I think the job should have been killed somewhere before this point. There
must be a bit of logic (though I haven't properly looked for it) which
decides the requested target is no good for the PCollection provided, so
the exception should be raised there with a message explaining this.
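
To make the idea concrete, here is a rough sketch of the kind of fail-fast
check I have in mind. Everything in it is hypothetical: TypeFamily, Target,
and checkCompatible are stand-ins I made up for illustration, not Crunch
classes, and it assumes the text source yields a Writable-backed collection
while the Avro target only accepts Avro-backed ones.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: none of these types are Crunch classes.
final class TargetCheckSketch {

    // Stand-in for the serialization family backing a collection.
    enum TypeFamily { WRITABLE, AVRO }

    // Stand-in for an output target that only accepts some type families.
    static final class Target {
        final String name;
        final Set<TypeFamily> accepted;

        Target(String name, Set<TypeFamily> accepted) {
            this.name = name;
            this.accepted = accepted;
        }
    }

    // Assumption: a text target takes anything, an Avro target only Avro types.
    static final Target TEXT_TARGET =
        new Target("text", EnumSet.allOf(TypeFamily.class));
    static final Target AVRO_TARGET =
        new Target("avro", EnumSet.of(TypeFamily.AVRO));

    // Reject the pairing when the write is requested, before any job is submitted.
    static void checkCompatible(TypeFamily family, Target target) {
        if (!target.accepted.contains(family)) {
            throw new IllegalArgumentException(
                "Cannot write a " + family + " collection to the " + target.name
                + " target; accepted type families: " + target.accepted);
        }
    }

    public static void main(String[] args) {
        // Compatible pairing: returns without throwing.
        checkCompatible(TypeFamily.WRITABLE, TEXT_TARGET);
        try {
            // Incompatible pairing: fails here with a message naming both sides.
            checkCompatible(TypeFamily.WRITABLE, AVRO_TARGET);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point is only that the failure surfaces at the write() call with a
message naming the collection type and the target, rather than later as an
unrelated "Output directory not set" from Hadoop.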

What do you think?

I'm sure there's a JIRA ticket lurking somewhere in all this - I'm just not
sure what it is! :)

Thanks,
Dave
