crunch-dev mailing list archives

From Matthias Friedrich
Subject Dependency conflicts
Date Sun, 05 Aug 2012 09:10:45 GMT

I spent most of Saturday resolving dependency conflicts for CRUNCH-16.
Since nobody's going to read a long mail, here are the cliff notes:

hadoop-core-1.0.3, hbase-0.90.5, and avro-1.7.0 are mutually incompatible,
and I found no safe way to fix that. Moving HBase support to a separate
Maven module may be the best solution because it reduces risk for
users who don't need HBase.

The longer version:

The POM of hadoop-core-1.0.3 is in a sorry state. It doesn't list all
libraries that are on the runtime classpath, and some of those it does
list are wrong. For example, integration tests using LocalJobRunner don't
work unless you add more dependencies yourself (e.g. commons-io). Also,
roughly a dozen of hbase-0.90.5's 40 dependencies are in conflict with
hadoop-core-1.0.3. This means we have to add quite a few "provided"
dependencies with the correct versions ourselves, but these aren't
propagated to our users, so they have to do the same or risk conflicts
at runtime.
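
As a rough sketch, the kind of POM entries involved might look like the
fragment below. The Maven coordinates are assumptions, the
${commons-io.version} property is a hypothetical placeholder, and the
single Avro exclusion only illustrates the mechanism; it is not the
actual list of conflicts. Running "mvn dependency:tree" shows what is
really pulled in.

```xml
<dependencies>
  <!-- On Hadoop's runtime classpath but missing from hadoop-core's POM,
       so it is declared here by hand. The version property is a
       hypothetical placeholder. -->
  <dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>${commons-io.version}</version>
    <scope>provided</scope>
  </dependency>

  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.90.5</version>
    <exclusions>
      <!-- Assumption: HBase brings in an older Avro under these
           coordinates. Verify with mvn dependency:tree. -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>avro</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

Because "provided" dependencies are not transitive, a project that depends
on Crunch would need its own copy of such entries.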

I resolved the conflicts to a point where our integration tests work,
which is unfortunately no guarantee that things will work for our users.
Using the dependencies of hadoop-core-1.0.3 + Crunch's, the source
distribution of hbase-0.90.5 doesn't even compile. At an interface
level, it is incompatible with protobuf-java-2.4.1 (easy enough to fix)
and avro-1.7.0 (not so easy to fix). Changing only those dependencies
that are interface compatible (about a dozen) unsurprisingly leads to
HBase test case failures. This may not affect HBase clients, but you
never know. There is no hbase-client library so you always get
everything unless you know HBase well enough to get your exclusions
right.

So, where do we go from here? I can get a patch ready that paints
over some of these problems and makes sure that the dependencies we
use in our test cases are the same as during runtime. But I really
need careful review for this.

To be honest, this situation leaves me a bit uneasy. Maybe the best
long-term solution would be to move HBase support to a separate Maven
module that depends on Crunch core, and not force it on everyone. This
would greatly reduce risk for those who don't need HBase. I think it's
definitely worth a shot.
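
A minimal sketch of what such a module's POM could look like. The
artifact names crunch-parent, crunch, and crunch-hbase, the groupId, and
the version are all hypothetical placeholders, not an agreed layout.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical parent POM shared by all Crunch modules. -->
  <parent>
    <groupId>org.apache.crunch</groupId>
    <artifactId>crunch-parent</artifactId>
    <version>0.0.0-SNAPSHOT</version>
  </parent>

  <!-- Hypothetical name for the HBase-only module. -->
  <artifactId>crunch-hbase</artifactId>

  <dependencies>
    <!-- Crunch core, which itself no longer depends on HBase. -->
    <dependency>
      <groupId>org.apache.crunch</groupId>
      <artifactId>crunch</artifactId>
      <version>${project.version}</version>
    </dependency>
    <!-- HBase and its transitive dependencies reach only those users
         who opt in to this module. -->
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <version>0.90.5</version>
    </dependency>
  </dependencies>
</project>
```

With a layout like this, users of Crunch core alone never see HBase on
their classpath, and the exclusions and version pins stay in one module.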

What do you think, guys?
