hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject [DISCUSSION] Upgrading core dependencies
Date Tue, 07 Feb 2017 20:21:57 GMT
Here's an old, thorny issue that won't go away. I'd like to hear what folks
are thinking these days.

My immediate need is that I want to upgrade Guava [1]. I want to move us to
guava 21.0, the latest release [2]. We currently depend on guava 12.0.
Hadoop's guava -- 11.0 -- is also on our CLASSPATH (three times). We could
just do it in an hbase-2.0.0, a major version release, but then
downstreamers and coprocessors that may have been a little lazy and that
have transitively come to depend on our versions of libs will break [3].
Then there is the murky area around the running of YARN/MR/Spark jobs, where
the ordering of libs on the CLASSPATH gets interesting; fat-jaring or
command-line antics can get you over (most) problems if you persevere.
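
To make the CLASSPATH point concrete, here is a minimal sketch of how one might count the copies of a class (say, a guava class) visible to the JVM. The class name ClasspathCopies and its methods are hypothetical, not part of HBase; the only API relied on is the JDK's ClassLoader.getResources. With standard delegation the first URL listed is typically the copy that gets loaded, but that depends on the class loader setup of the job in question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper (not part of HBase): lists every copy of a
// class that the system class loader can see on the CLASSPATH.
class ClasspathCopies {

  // Converts a fully qualified class name into its resource path, e.g.
  // "com.google.common.base.Preconditions" ->
  // "com/google/common/base/Preconditions.class".
  static String toResourcePath(String className) {
    return className.replace('.', '/') + ".class";
  }

  // Returns one URL per location (jar or directory) that holds the class.
  // An empty list means the class is not visible to the system class loader.
  static List<URL> findCopies(String className) {
    ClassLoader loader = ClassLoader.getSystemClassLoader();
    try {
      return new ArrayList<>(
          Collections.list(loader.getResources(toResourcePath(className))));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Defaults to a guava class; pass another class name as the first argument.
    String name = args.length > 0 ? args[0] : "com.google.common.base.Preconditions";
    List<URL> copies = findCopies(name);
    System.out.println(name + ": " + copies.size() + " location(s) found");
    for (URL url : copies) {
      System.out.println("  " + url);
    }
  }
}
```

Running this with the same CLASSPATH as the job or daemon under suspicion shows which jars carry a copy of the class; more than one location is a hint that ordering will decide which version wins.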

Multiply the above by netty, jackson, and a few other favorites.

Our proffered solution to the above is the shaded hbase artifact project;
have applications and tasks refer to the shaded hbase client instead.
Because we've not done the work to narrow the surface area we expose to
downstreamers, most consumers of our API -- certainly in a spark/MR context
since our MR utility is still buried in the hbase-server module -- need both
the shaded hbase client and server on their CLASSPATH (i.e. near all of

Leaving aside for the moment that our shaded client and server need
untangling, getting folks up on the shaded artifacts takes evangelizing
effort. We also need to be doing work to make sure our shading
doesn't leak dependencies, that it works for all deploy scenarios, and that
this route forward is well doc'd, and so on.

I don't see much evidence of our pushing the shaded artifacts route nor of
their being used. What is the perception of others?

I played with adding a new module to host shaded 3rd party libs [4]. There
are a couple of downsides: internally, we would have to refer to the offset
(relocated) version of each lib, and we bulk up our tarball by a bunch of
megs (the build gets a few seconds longer, not much). The upside is that we
can float over a variety of hadoop/spark versions using whatever guava or
netty we want; downstreamers and general users should have an easier time
of it too because they'll be less likely to run into library clashes. Is
this project worth finishing?


1. I wanted to make use of the protobuf to-json tool. It is in the
extra jar, protobuf-util. It requires guava 16.0.
2. Guava is a quality lib that should be at the core of all our dev but we
are gun shy around using it because it semver's with gusto at a rate that
is orders of magnitude in advance of the Hadoop/HBase cadence.
3. We are trying to minimize breakage when we go to hbase-2.0.0.
4. HBASE-15749 suggested this but was shut down because it made no case for
why we'd want to do it.
