avro-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: How to work around MAPREDUCE-1700
Date Thu, 12 Aug 2010 23:19:52 GMT
Yup, Hadoop's jar dependency management is poor.

I just repackaged Hadoop and removed the Jackson 1.0.1 jar from it, so none of my slaves have Jackson in their lib directory.  It's only used for the 'dump configuration in JSON format' feature, which I don't use (and a minor feature that adds a jar dependency should probably have gotten a lot more scrutiny before the jar/feature went in).

Alternatively, since Jackson 1.x releases are backwards compatible with 1.0, one can replace 1.0.1 with a newer 1.x.x jar on the slaves or in the package.
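If you go the replacement route, the swap on each slave looks something like this.  A sketch only -- the lib path and version numbers are assumptions for a stock 0.20 tarball install, so point LIB_DIR at your actual layout:

```shell
# Sketch: swap the stale Jackson jar for a newer 1.x jar on a slave.
# LIB_DIR defaults to a temp dir for demonstration; on a real slave set it
# to the Hadoop lib directory (e.g. /usr/lib/hadoop-0.20/lib).
LIB_DIR=${LIB_DIR:-$(mktemp -d)}

# Stand-in for the stale jar Hadoop ships (already present on a real slave):
touch "$LIB_DIR/jackson-core-asl-1.0.1.jar"

# Remove the old jar...
rm -f "$LIB_DIR/jackson-core-asl-1.0.1.jar"

# ...and drop in a backwards-compatible 1.x replacement.  This is a
# placeholder file; on a real slave, copy the actual jar that your
# Avro job depends on.
touch "$LIB_DIR/jackson-core-asl-1.5.4.jar"

ls "$LIB_DIR"
```

Either step works on its own: removing the old jar is enough if nothing else needs Jackson, and the replacement jar keeps the JSON-dump feature working.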

On Aug 12, 2010, at 3:40 PM, David Rosenstrauch wrote:

> Anyone have any ideas how I might be able to work around 
> https://issues.apache.org/jira/browse/MAPREDUCE-1700 ?  It's quite a 
> thorny issue!
> I have a M/R job that's using Avro (v1.3.3).  Avro, in turn, has a 
> dependency on Jackson (of which I'm using v1.5.4).  I'm able to add the 
> jars to the distributed cache fine, and my Mapper starts to run and load 
> Avro ... and then blammo:  "Error: 
> org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;"
> The problem is that there's already an older (and obviously 
> incompatible) version of Jackson (v1.0.1) that's already included in the 
> Hadoop distribution.  And since that appears earlier on the classpath 
> than my Jackson jars, I get the error.
> There doesn't seem to be any elegant solution to this.  I can't 
> downgrade to an earlier version of Avro, as my code relies on features 
> in the newer version.  And there doesn't seem to be any way 
> configuration-wise to solve this either (i.e., tell Hadoop to use the 
> newer Jackson jars for my M/R job, or to add those jars earlier on the 
> classpath).
> Near as I can tell, the only solutions involve doing a hack on each of 
> my slave nodes.  I.e., either:
> a) removing the existing jackson jars on each slave (since I have no 
> need for the Hadoop feature that requires Jackson), or
> b) putting my newer jackson jars onto each slave in a place where it 
> will be loaded before the older one (e.g., 
> /usr/lib/hadoop-0.20/lib/aaaaa_jackson-core-asl-1.5.4.jar)
> Either of these options is a bit of a hack - and error prone as well, 
> since my job tasks will fail on any node that doesn't have this hack 
> applied.
> Is there any cleaner way to resolve this issue that I'm not seeing?
> Thanks,
> DR
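
One way to confirm which jar a task actually picked up is to ask the classloader where a class came from.  A hedged sketch -- the helper class name is made up, and in a real task you'd pass "org.codehaus.jackson.JsonFactory" (the class from the stack trace); here it's demonstrated with a JDK class, which reports the bootstrap loader:

```java
import java.security.CodeSource;

// Sketch: report the jar (or classpath entry) a given class was loaded from,
// to verify whether the old 1.0.1 jar or the new 1.5.4 jar won.
public class WhichJar {
    public static String locationOf(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        CodeSource src = c.getProtectionDomain().getCodeSource();
        // JDK core classes have no CodeSource; application jars report their URL.
        return src == null ? "(bootstrap classloader)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // In a mapper, log locationOf("org.codehaus.jackson.JsonFactory") instead.
        System.out.println(locationOf("java.lang.String"));
    }
}
```

Logging that from the mapper's setup makes it obvious, per node, whether the hack in (a) or (b) actually took effect.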
