hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: [DISCUSS] More Shading
Date Wed, 12 Apr 2017 21:30:10 GMT
Thanks for the great input all.

See below:

On Wed, Apr 12, 2017 at 9:01 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> On Wed, Apr 12, 2017 at 8:28 AM Josh Elser <elserj@apache.org> wrote:
> >
> >
> > Sean Busbey wrote:
> > > On Tue, Apr 11, 2017 at 11:43 PM Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > >
> > >>> This effort is about our internals. We have a mess of other components
> > >>> all up inside us such as HDFS, etc., each with their own sets of
> > >>> dependencies many of which we have in common. This project is about
> > >>> making it so we can upgrade at a rate independent of when our
> > >>> upstreamers choose to change.
> > >>

(I'd add to the above that we can upgrade libs w/o breaking downstreamers
also -- but this point becomes an intrinsic later in the thread)

> > >> If the above quote is true, then I think what we want is a set of shaded
> > >> Hadoop client libs that we can depend on so as to not get all the
> > >> transitive deps. Hadoop doesn't provide it, but we could do so ourselves
> > >> with (yet another) module in our project. Assuming, that is, the upstream
> > >> client interfaces are well defined and don't leak stuff we care about.

We should do this too (I think you've identified the big 'if' w/ the above
identified assumption). As you say later, "... it's time we firm up the
boundaries between us and Hadoop.". There is some precedent with
hadoop-compat-* modules. Hadoop would be relocated?

Spitballing, IIUC, I think this would be a big job (once per version and
the vagaries of hadoop/spark) with no guarantee of success on the other end
because of the assumption you call out. Do I have this right?


> Isolating our clients from our deps is best served by our shaded modules.
> What do you think about turning things on their head: for 2.0 the
> hbase-client jar is the shaded artifact by default, not the other way
> around? We have cleanup to get our deps out of our public interfaces in
> order to make this work.

We should do this at least going forward. hbase2 is the opportunity.
Testing and doc are all that is needed? I added it to our hbase2 description
doc as a deliverable (though not a blocker).

> This proposal of an external shaded dependencies module sounds like an
> attempt to solve both concerns at once. It would isolate ourselves from
> Hadoop's deps, and it would isolate our clients from our deps. However, it
> doesn't isolate our clients from Hadoop's deps, so our users don't really
> gain anything from it. I also argue that it creates an unreasonable release
> engineering burden on our project. I'm also not clear on the implications
> to downstreamers who extend us with coprocessors.

Other than a missing 'quick-fix' descriptor, you call what is proposed well
... except where you think the prebuild will be burdensome. Here I think
otherwise as I think releases will be rare, there is nought 'new' in a
release but packaged 3rd-party libs, and verification/vote by PMCers should
be a simple affair.

Do you agree that the fixing-what-we-leak-of-hadoop-to-downstreamers is
distinct from the narrower task proposed here where we are trying to
unhitch ourselves from the netty/guava hadoop uses? (Currently we break
against hadoop3 because of a netty incompat., HADOOP-13866, which we might be
able to solve w/ exclusions ... but ...).

The two tasks can be run in parallel?

For CPs, they should bring their own bedding and towels and not be trying
to use ours. On the plus-side, we could upgrade core 3rd-party libs and the
CP would keep working.
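
To make the relocation idea concrete, here is a minimal sketch of the package
renaming a shading step performs. It is an illustration only, not the
maven-shade-plugin API: the class name RelocationSketch, the prefix
org.apache.hadoop.hbase.shaded. and the list of relocated packages are all
assumptions made up for this example (netty and guava are picked because they
are the libs named above).

```java
// Hypothetical illustration: mimics the class-name rewriting a shading tool
// applies to bundled third-party libs. Not a real build-tool API.
final class RelocationSketch {
    // Assumed prefix under which our private copies would live.
    static final String SHADED_PREFIX = "org.apache.hadoop.hbase.shaded.";

    // Assumed set of third-party packages we bundle and relocate.
    static final String[] RELOCATED = {"com.google.common.", "io.netty."};

    // Returns the relocated name for classes in a relocated package,
    // and the unchanged name for everything else.
    static String relocate(String className) {
        for (String pkg : RELOCATED) {
            if (className.startsWith(pkg)) {
                return SHADED_PREFIX + className;
            }
        }
        return className;
    }

    public static void main(String[] args) {
        String[] samples = {
            "com.google.common.collect.Lists", // ours after relocation
            "io.netty.buffer.ByteBuf",         // ours after relocation
            "org.apache.hadoop.fs.FileSystem"  // left alone
        };
        for (String s : samples) {
            System.out.println(s + " -> " + relocate(s));
        }
    }
}
```

Because our internals would reference only the relocated names, whatever
guava or netty version Hadoop or a CP puts on the classpath under the original
names would no longer collide with the copy we use.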

