hbase-dev mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: [DISCUSSION] Upgrading core dependencies
Date Wed, 08 Feb 2017 22:36:16 GMT
(late to the party, but..)

+1 Nick sums this up better than I could have.

Nick Dimiduk wrote:
> For the client: I'm a fan of shaded client modules by default and
> minimizing the exposure of that surface area of 3rd party libs (none, if
> possible). For example, Elastic Search has a similar set of challenges; they
> solve it by advocating that users shade from step 1. It's addressed first thing
> in the docs for their client libs. We could take it a step further by
> making the shaded client the default client (o.a.hbase:hbase-client)
> artifact and internally consume an hbase-client-unshaded. Turns the whole
> thing on its head in a way that's better for the naive user.
> For MR/Spark/etc connectors: We're probably stuck as it is until necessary
> classes can be extracted from hbase-server. I haven't looked into this
> lately, so I hesitate to give a prescription.
> For coprocessors: They forfeit their right to 3rd party library dependency
> stability by entering our process space. Maybe in 3.0 or 4.0 we can rebuild
> on jigsaw or OSGi, but for today I think the best we should do is provide
> relatively stable internal APIs. I also find it unlikely that we'd want to
> spend loads of cycles optimizing for this use case. There are other, bigger
> fish, IMHO.
> For size/compile time: I think these ultimately matter less than user
> experience. Let's find a solution that sucks less for downstreamers and
> work backward on reducing bloat.
> On the point of leaning heavily on Guava: their pace is traditionally too
> fast for us to expose in any public API. Maybe that's changing, in which
> case we could reconsider for 3.0. Better to start using the new APIs
> available in Java 8...
> Thanks for taking this up, Stack.
> -n
> On Tue, Feb 7, 2017 at 12:22 PM Stack <stack@duboce.net> wrote:
>> Here's an old thorny issue that won't go away. I'd like to hear what folks
>> are thinking these times.
>> My immediate need is that I want to upgrade Guava [1]. I want to move us to
>> guava 21.0, the latest release [2]. We currently depend on guava 12.0.
>> Hadoop's guava -- 11.0 -- is also on our CLASSPATH (three times). We could
>> just do it in an hbase-2.0.0, a major version release, but then
>> downstreamers and coprocessors that may have been a little lazy and that
>> have transitively come to depend on our versions of libs will break [3].
>> Then there is the murky area around the running of YARN/MR/Spark jobs where
>> the ordering of libs on the CLASSPATH gets interesting where fat-jaring or
>> command-line antics can get you over (most) problems if you persevere.
>> Multiply the above by netty, jackson, and a few other favorites.
>> Our proffered solution to the above is the shaded hbase artifact project;
>> have applications and tasks refer to the shaded hbase client instead.
>> Because we've not done the work to narrow the surface area we expose to
>> downstreamers, most consumers of our API -- certainly in a spark/MR context
>> since our MR utility is buried in hbase-server module still -- need both
>> the shaded hbase client and server on their CLASSPATH (i.e. near all of
>> hbase).
>> Leaving aside for the moment that our shaded client and server need
>> untangling, getting folks up on the shaded artifacts takes effort
>> evangelizing. We also need to be doing work to make sure our shading
>> doesn't leak dependencies, that it works for all deploy scenarios, and that
>> this route forward is well doc'd, and so on.
>> I don't see much evidence of our pushing the shaded artifacts route nor of
>> their being used. What is the perception of others?
>> I played with adding a new module to host shaded 3rd party libs [4]. There
>> are a couple of downsides: internally we would have to refer to the offset
>> version of the lib, and we bulk up our tarball by a bunch of megs (the build
>> gets a few seconds longer, not much). The upside is that we can float over a
>> variety of hadoop/spark versions using whatever guava or netty we want;
>> downstreamers and general users should have an easier time of it too
>> because they'll be less likely to run into library clashes. Is this project
>> worth finishing?
>> WDYT?
>> St.Ack
>> 1. I wanted to make use of the protobuf to-json tool. It is in the
>> extra-jar, protobuf-util. It requires guava 16.0.
>> 2. Guava is a quality lib that should be at the core of all our dev but we
>> are gun shy around using it because it semver's with gusto at a rate that
>> is orders of magnitude in advance of the Hadoop/HBase cadence.
>> 3. We are trying to minimize breakage when we go to hbase-2.0.0.
>> 4. HBASE-15749 suggested this but was shut down because it made no case for
>> why we'd want to do it.
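
The library clashes described in the quoted thread (several guava versions on one CLASSPATH) can be hard to see from the outside. As a minimal sketch, not taken from the original messages, the following Java program prints which jar a given class was actually loaded from. The class name `WhichJar`, the helper `locationOf`, and the default guava class name are all illustrative assumptions.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, or a marker
    // string for classes with no code source (e.g. core JDK classes).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return (src == null || src.getLocation() == null)
            ? "(no code source)"
            : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // The default class name is only an example; pass any class name as an argument.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Preconditions";
        try {
            // initialize=false: look the class up without running its static initializers.
            Class<?> cls = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " loaded from " + locationOf(cls));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Run with the same classpath as the job in question, this shows which copy of a library wins when more than one version is present. With shaded artifacts, the relocated class name would be looked up instead of the original one.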
