hbase-dev mailing list archives

From Sean Busbey <bus...@apache.org>
Subject Re: [DISCUSS] More Shading
Date Wed, 12 Apr 2017 13:06:25 GMT
On Tue, Apr 11, 2017 at 11:43 PM Nick Dimiduk <ndimiduk@gmail.com> wrote:

> > This effort is about our internals. We have a mess of other components all
> > up inside us such as HDFS, etc., each with their own sets of dependencies
> > many of which we have in common. This project is about making it so we
> > can upgrade at a rate independent of when our upstreamers choose to change.
>
> Pardon as I try to get a handle on the intention behind this thread.
> If the above quote is true, then I think what we want is a set of shaded
> Hadoop client libs that we can depend on so as to not get all the
> transitive deps. Hadoop doesn't provide it, but we could do so ourselves
> with (yet another) module in our project. Assuming, that is, the upstream
> client interfaces are well defined and don't leak stuff we care about. It
> also creates a terrible nightmare for anyone downstream of us who
> repackages HBase. The whole thing is extremely error-prone, because there's
> not very good tooling for this. Realistically, we end up with a combination
> of the enforcer plugin and maybe our own custom plugin to ensure clean
> transitive dependencies...

Hadoop does provide a shaded client as of the 3.0.0* release line. We could
push as a community for a version of that for Hadoop's branch-2.

Unfortunately, that shaded client won't help where we're reaching into the
guts of Hadoop (like our reliance on their web stuff).

> I guess the suggestion of the external repo containing our shaded fork of
> everything we depend on allows us to continue to compile and run on Hadoop's
> transitive dependency list without actually using any of it. Do I have that
> right? How would we version this thing?

Yes, that's correct. The simplest approach would be to version it similarly to
how we do now, starting at version 1.0.0 and bumping whenever we change a
dependency.

> Between these two choices, I prefer the former as a "more correct"
> solution, but it depends entirely on how clean a shaded Hadoop we can
> reliably produce inline in our build.

If we're going to try to go the route of cleaning up how we rely on Hadoop,
the bigger issue IMHO is getting ourselves off of things not included in
their client jars.
