incubator-crunch-user mailing list archives

From Matthias Friedrich <m...@mafr.de>
Subject Re: Crunch with Elastic MapReduce
Date Wed, 15 Aug 2012 06:48:19 GMT

Hi Shawn,

thanks a lot for your mail and your patch! See my answers below.

On Tuesday, 2012-08-14, Shawn Smith wrote:
[...]
> 3. EMR Hadoop 1.0.3 includes Avro 1.5.3 which apparently takes precedence
> over Crunch's Avro 1.7.0.  I didn't mess around with trying to get my classes
> in the class path first…  Instead I used the maven-shade-plugin in my job's
> build to shade Avro 1.7.0 from "org.apache.avro.*" to
> "shaded.org.apache.avro.*" so it wouldn't conflict with the EMR version of
> Avro.  Example exception (you can see the Avro source code line numbers
> correspond to version 1.5.3):
[...]

Hadoop provides no classloader isolation; I've been bitten by this several
times, too. There's a crude workaround you can try:

    export HADOOP_USER_CLASSPATH_FIRST=true

You have to set it before running the hadoop script. I don't see any other
options until Hadoop is fixed.
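
A minimal sketch of how that might look in practice, assuming a POSIX-style
shell. The wrapper function name, the jar name and the main class are
hypothetical placeholders, not something from your setup:

```shell
# Run a command with HADOOP_USER_CLASSPATH_FIRST=true in its environment.
# The variable is set only for the command being launched, so it is in
# place before the hadoop script starts and does not stay set in the
# calling shell afterwards.
run_user_classpath_first() {
    HADOOP_USER_CLASSPATH_FIRST=true "$@"
}

# Hypothetical usage (jar and class names are placeholders):
# run_user_classpath_first hadoop jar my-crunch-job.jar com.example.MyPipeline
```

Scoping the variable to a single invocation keeps other hadoop commands in the
same shell on the default classpath order, which makes it easier to tell
whether the workaround is what changed the behaviour.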

[...]
> 4. EMR Hadoop 1.0.3 includes two different versions of SLF4J in the class
> path: 1.4.3 and 1.6.4.  As a result, jobs that use SLF4J will fail
> non-deterministically when a particular run uses slf4j-api-1.4.3.jar with
> slf4j-log4j12-1.6.4.jar, as described in the SLF4J FAQ
> (http://www.slf4j.org/faq.html#IllegalAccessError).  It looks like you can
> workaround the problem by using shaded SLF4J jars and not relying on the
> ones provided by the Hadoop distribution.  The stack trace looks something
> like this:

That's a nasty bug in EMR's Hadoop. We've recently downgraded our dependency
to slf4j 1.4.3 and I'm going to try fixing some more of these problems in
CRUNCH-16.

Please let us know if you experience any more problems!

Regards,
  Matthias
