hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: [DISCUSS] More Shading
Date Mon, 24 Apr 2017 16:39:35 GMT
FYI, MNG-5899 makes shaded builds fragile, effectively limiting
multi-module shaded projects to maven 3.2.x. Apparently the Apache Storm
folks tripped over this earlier, and as I recall, Apache Flink used to
require building with 3.2.x for the same reason.


On Tue, Apr 18, 2017 at 9:20 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> On Wed, Apr 12, 2017 at 2:30 PM, Stack <stack@duboce.net> wrote:
>> > > >> If the above quote is true, then I think what we want is a set of
>> > > >> shaded Hadoop client libs that we can depend on so as to not get all
>> > > >> transitive deps. Hadoop doesn't provide it, but we could do so
>> > > >> ourselves with (yet another) module in our project. Assuming, that is,
>> > > >> the upstream client interfaces are well defined and don't leak stuff we
>> > > >> care about.
>> >
>> We should do this too (I think you've identified the big 'if' w/ the above
>> identified assumption). As you say later, "... it's time we firm up the
>> boundaries between us and Hadoop.". There is some precedent with
>> hadoop-compat-* modules. Hadoop would be relocated?
> Ideally we'd relocate any parts of Hadoop that are not part of our public
> contract. Not sure if there's an intersection between "ideal" and
> "practical" though.
>> Spitballing, IIUC, I think this would be a big job (once per version and
>> the vagaries of hadoop/spark) with no guarantee of success on other end
>> because of assumption you call out. Do I have this right?
> Yeah you have my meaning. My argument is not whether we should shade but
> rather how we make it a maintainable deployment tool for our team of
> volunteers. Hence interest in compatibility verification tools like we do
> with our api compatibility tools.
>> > Isolating our clients from our deps is best served by our shaded modules.
>> > What do you think about turning things on their head: for 2.0 the
>> > hbase-client jar is the shaded artifact by default, not the other way
>> > around? We have cleanup to get our deps out of our public interfaces in
>> > order to make this work.
>> >
>> >
>> We should do this at least going forward. hbase2 is the opportunity.
>> Testing and doc is all that is needed? I added it to our hbase2 description
>> doc as a deliverable (though not a blocker).
> I've not tried to consume these efforts. A reasonable test-case to see if
> these are ready for prime-time would be to try rebuilding one of the more
> complex downstream projects (e.g., Phoenix, Trafodion, Splice) using the
> shaded jars and see how bad the diff is.
>> > This proposal of an external shaded dependencies module sounds like an
>> > attempt to solve both concerns at once. It would isolate ourselves from
>> > Hadoop's deps, and it would isolate our clients from our deps. However, it
>> > doesn't isolate our clients from Hadoop's deps, so our users don't really
>> > gain anything from it. I also argue that it creates an unreasonable release
>> > engineering burden on our project. I'm also not clear on the implications
>> > to downstreamers who extend us with coprocessors.
>> >
>> Other than a missing 'quick-fix' descriptor, you call what is proposed well
>> ....except where you think the prebuild will be burdensome. Here I think
>> otherwise as I think releases will be rare, there is nought 'new' in a
>> release but packaged 3rd-party libs, and verification/vote by PMCers should
>> be a simple affair.
> Maybe it's not such a burden? If the 2.0 and 3.0 RM's are brave and true,
> it's worth a go.
>> Do you agree that the fixing-what-we-leak-of-hadoop-to-downstreamers is
>> distinct from the narrower task proposed here where we are trying to
>> unhitch ourselves of the netty/guava hadoop uses? (Currently we break
>> against hadoop3 because of netty incompat., HADOOP-13866, which we might be
>> able to solve w/ exclusions.....but....).
>> The two tasks can be run in parallel?
> Indeed, they seem distinct but quite related.
>> For CPs, they should bring their own bedding and towels and not be trying
>> to use ours. On the plus-side, we could upgrade core 3rd-party libs and the
>> CP would keep working.
> All of this sounds like an ideal state.
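
As an illustration of the kind of verification tooling mentioned above, here
is a minimal sketch (not an existing HBase tool) that lists the class files in
a jar that sit outside a set of allowed package prefixes, i.e. third-party
classes that were bundled without being relocated. The function names, the
demo jar contents, and the allowed prefix org/apache/hadoop/hbase/ are all
assumptions made up for this example.

```python
import io
import zipfile


def unrelocated_classes(jar_bytes, allowed_prefixes):
    """Return sorted .class entries that fall outside every allowed prefix."""
    leaked = []
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue  # skip manifests, resources, directories
            if not any(name.startswith(p) for p in allowed_prefixes):
                leaked.append(name)
    return sorted(leaked)


def build_demo_jar(entries):
    """Build an in-memory jar with empty entries (hypothetical test fixture)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name in entries:
            jar.writestr(name, b"")
    return buf.getvalue()


# Hypothetical contents: one project class, one relocated Guava class,
# and one Guava class that slipped through without relocation.
demo = build_demo_jar([
    "org/apache/hadoop/hbase/client/Table.class",
    "org/apache/hadoop/hbase/shaded/com/google/common/base/Preconditions.class",
    "com/google/common/base/Preconditions.class",
    "META-INF/MANIFEST.MF",
])

leaks = unrelocated_classes(demo, ["org/apache/hadoop/hbase/"])
print(leaks)  # ['com/google/common/base/Preconditions.class']
```

A check like this only looks at entry names; it says nothing about whether
relocated types still appear in public method signatures, which is the harder
part of the cleanup discussed in this thread.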
