hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: [DISCUSS] More Shading
Date Wed, 19 Apr 2017 04:20:47 GMT
On Wed, Apr 12, 2017 at 2:30 PM, Stack <stack@duboce.net> wrote:

> > If the above quote is true, then I think what we want is a set of shaded
> > Hadoop client libs that we can depend on so as to not get all the
> > transitive deps. Hadoop doesn't provide it, but we could do so ourselves
> > with (yet another) module in our project. Assuming, that is, the upstream
> > client interfaces are well defined and don't leak stuff we care about.
> >
> We should do this too (I think you've identified the big 'if' w/ the above
> identified assumption). As you say later, "... it's time we firm up the
> boundaries between us and Hadoop.". There is some precedent with
> hadoop-compat-* modules. Hadoop would be relocated?

Ideally we'd relocate any parts of Hadoop that are not part of our public
contract. Not sure if there's an intersection between "ideal" and
"practical" though.

> Spitballing, IIUC, I think this would be a big job (once per version and
> the vagaries of hadoop/spark) with no guarantee of success on other end
> because of assumption you call out. Do I have this right?

Yeah you have my meaning. My argument is not whether we should shade but
rather how we make it a maintainable deployment tool for our team of
volunteers. Hence my interest in compatibility verification tools, similar to
our API compatibility tools.

> > Isolating our clients from our deps is best served by our shaded modules.
> > What do you think about turning things on their head: for 2.0 the
> > hbase-client jar is the shaded artifact by default, not the other way
> > around? We have cleanup to get our deps out of our public interfaces in
> > order to make this work.
> >
> >
> We should do this at least going forward. hbase2 is the opportunity.
> Testing and doc is all that is needed? I added it to our hbase2 description
> doc as a deliverable (though not a blocker).

I've not tried to consume these efforts. A reasonable test-case to see if
these are ready for prime-time would be to try rebuilding one of the more
complex downstream projects (e.g., Phoenix, Trafodion, Splice) using the
shaded jars and see how bad the diff is.

> > This proposal of an external shaded dependencies module sounds like an
> > attempt to solve both concerns at once. It would isolate ourselves from
> > Hadoop's deps, and it would isolate our clients from our deps. However, it
> > doesn't isolate our clients from Hadoop's deps, so our users don't really
> > gain anything from it. I also argue that it creates an unreasonable release
> > engineering burden on our project. I'm also not clear on the implications
> > to downstreamers who extend us with coprocessors.
> >
> Other than a missing 'quick-fix' descriptor, you call what is proposed well
> ....except where you think the prebuild will be burdensome. Here I think
> otherwise as I think releases will be rare, there is nought 'new' in a
> release but packaged 3rd-party libs, and verification/vote by PMCers should
> be a simple affair.

Maybe it's not such a burden? If the 2.0 and 3.0 RMs are brave and true,
it's worth a go.

> Do you agree that the fixing-what-we-leak-of-hadoop-to-downstreamers is
> distinct from the narrower task proposed here where we are trying to
> unhitch ourselves of the netty/guava hadoop uses? (Currently we break
> against hadoop3 because of netty incompat., HADOOP-13866, which we might be
> able to solve w/ exclusions.....but....).
> The two tasks can be run in parallel?

Indeed, they seem distinct but quite related.

> For CPs, they should bring their own bedding and towels and not be trying
> to use ours. On the plus-side, we could upgrade core 3rd-party libs and the
> CP would keep working.

All of this sounds like an ideal state.
