crunch-dev mailing list archives

From Dave Beech
Subject Misleading error message
Date Fri, 28 Jun 2013 15:07:44 GMT
Hi all,

Please take a look at the following pipeline:

read(From.textFile(args[0])).write(To.textFile(args[1] + "-text"));
read(From.textFile(args[0])).write(To.sequenceFile(args[1] + "-seq"));
read(From.textFile(args[0])).write(To.avroFile(args[1] + "-avro"));

The first two jobs run fine and produce the expected output types, text and
sequence files respectively. The text-to-Avro conversion fails. This is no
great surprise, knowing a little about the internals of Crunch, but when
put alongside the other examples it feels like it should work.

Even if it can't work - no big deal, it's just a toy example. The main
problem for me was the error message:

13/06/28 14:11:40 INFO jobcontrol.CrunchControlledJob:
org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
at org.apache.hadoop.mapred.JobClient$
at org.apache.hadoop.mapred.JobClient$

I think the job should have been killed somewhere before this point. There
must be a bit of logic (though I haven't properly looked for it) which
decides the requested target is no good for the PCollection provided, so
the exception should be raised there with a message explaining this.
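
To make the suggestion concrete, here is a rough sketch of the kind of
fail-fast check I have in mind. Everything in it is hypothetical: the
TypeFamily and TargetKind enums, the compatibility table and the
checkCompatible method are stand-ins for illustration only, not Crunch's
actual classes or logic. The point is just that the mismatch is reported
when the write is requested, with a message naming the target and the type
family, rather than surfacing later as a Hadoop configuration error.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative only: these types are hypothetical stand-ins, not Crunch's API.
class TargetCheckSketch {

    // Assumed serialization families a collection's type might belong to.
    enum TypeFamily { WRITABLE, AVRO }

    // Assumed target kinds, each with the families it is willing to accept.
    // The table itself is an assumption chosen to mirror the three cases above.
    enum TargetKind {
        TEXT(EnumSet.of(TypeFamily.WRITABLE, TypeFamily.AVRO)),
        SEQUENCE_FILE(EnumSet.of(TypeFamily.WRITABLE)),
        AVRO_FILE(EnumSet.of(TypeFamily.AVRO));

        private final Set<TypeFamily> accepted;

        TargetKind(Set<TypeFamily> accepted) {
            this.accepted = accepted;
        }

        boolean accepts(TypeFamily family) {
            return accepted.contains(family);
        }
    }

    // Fail at the point where the write is requested, with a message that
    // names the target, the path and the offending type family.
    static void checkCompatible(TypeFamily family, TargetKind target, String path) {
        if (!target.accepts(family)) {
            throw new IllegalArgumentException(
                "Target " + target + " at '" + path + "' cannot accept a collection of "
                + family + " type; convert the collection to a compatible type first.");
        }
    }

    public static void main(String[] args) {
        // The first two writes pass the check; the third is rejected up front.
        checkCompatible(TypeFamily.WRITABLE, TargetKind.TEXT, "out-text");
        checkCompatible(TypeFamily.WRITABLE, TargetKind.SEQUENCE_FILE, "out-seq");
        try {
            checkCompatible(TypeFamily.WRITABLE, TargetKind.AVRO_FILE, "out-avro");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Wherever the real compatibility decision lives, raising the exception at that
point would mean no job is ever submitted for the bad target.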

What do you think?

I'm sure there's a JIRA ticket lurking somewhere in all this - I'm just not
sure what it is! :)

