incubator-crunch-user mailing list archives

From Matthias Friedrich
Subject Re: Crunch with Elastic MapReduce
Date Wed, 15 Aug 2012 06:48:19 GMT
Hi Shawn,

Thanks a lot for your mail and your patch! See my answers below.

On Tuesday, 2012-08-14, Shawn Smith wrote:
> 3. EMR Hadoop 1.0.3 includes Avro 1.5.3 which apparently takes precedence
> over Crunch's Avro 1.7.0.  I didn't mess around with trying to get my classes
> in the class path first…  Instead I used the maven-shade-plugin in my job's
> build to shade Avro 1.7.0 from "org.apache.avro.*" to
> "*" so it wouldn't conflict with the EMR version of
> Avro.  Example exception (you can see the Avro source code line numbers
> correspond to version 1.5.3):

Hadoop provides no classloader isolation, so whichever version of a library
comes first on the class path wins. I've been bitten by this several times,
too. There's a crude workaround you can try:


You have to set it before running the hadoop script. I don't see other
options at this point until Hadoop is fixed.
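
To confirm which copy of a library actually wins on a given class path, a
small diagnostic along these lines can help. It is only a sketch and not part
of the original exchange: the class name WhichJar and the helper locationOf
are made up for illustration. On the cluster you would pass a class such as
org.apache.avro.Schema as the argument.

```java
import java.security.CodeSource;

// Hypothetical helper: reports where the JVM loaded a class from.
class WhichJar {

    // Returns the jar or directory a class came from, or a marker string
    // when the class was supplied by the bootstrap loader (no code source).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap classpath)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to a JDK class so the program runs without extra jars.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

Running it with the same class path as the job (for example from inside a
task) shows whether the class was picked up from the job jar or from a jar
shipped with the Hadoop distribution.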

> 4. EMR Hadoop 1.0.3 includes two different versions of SLF4J in the class
> path: 1.4.3 and 1.6.4.  As a result, jobs that use SLF4J will fail
> non-deterministically when a particular run uses slf4j-api-1.4.3.jar with
> slf4j-log4j12-1.6.4.jar, as described in the SLF4J FAQ.  It looks like you
> can work around the problem by using shaded SLF4J jars and not relying on
> the ones provided by the Hadoop distribution.  The stack trace looks
> something like this:

That's a nasty bug in EMR's Hadoop. We've recently downgraded our dependency
to slf4j 1.4.3 and I'm going to try fixing some more of these problems in

Please let us know if you experience any more problems!

