accumulo-dev mailing list archives

From Benson Margulies <bimargul...@gmail.com>
Subject Re: Hadoop 2 compatibility issues
Date Wed, 15 May 2013 00:27:11 GMT
Maven will malfunction in various entertaining ways if you try to
change the GAV of the output of the build using a profile.

Maven will malfunction in various entertaining ways if you use
classifiers on real-live-JAR files that get used as
real-live-dependencies, because it has no concept of a
pom-per-classifier.
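
To make that concrete, the sort of thing I mean is a profile that
attaches the main jar under a classifier (or rewrites its coordinates);
a rough sketch, with the profile id and classifier purely illustrative:

<profile>
  <id>hadoop2</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <configuration>
          <!-- attaches the jar under a "hadoop2" classifier; there is
               still only one pom for the GAV, so the classified jar's
               published dependencies describe the default build -->
          <classifier>hadoop2</classifier>
        </configuration>
      </plugin>
    </plugins>
  </build>
</profile>

Whatever jar you attach that way, the pom Maven deploys next to it is
the same single pom, which is exactly where the trouble starts.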

Where does this leave you/us? (I'm not sure that I've earned an 'us'
recently around here.)

First, I note that 'Apache releases are source releases'. So, one
resort of scoundrels here would be to support only one hadoop in the
convenience binaries that get pushed to Maven Central, and let other
hadoop users take the source release and build for themselves.

Second, I am reduced to suggesting an elaboration of the build in
which some tool edits poms and runs builds. The maven-invoker-plugin
could be used to run that, but a plain old script in a plain old
language might be less painful.
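
If someone did want the invoker route, the shape of it would be roughly
this (the hadoop2-poms directory and the goal list are placeholders for
whatever the pom-editing step actually produces):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-invoker-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>run</goal>
      </goals>
      <configuration>
        <!-- build the script-edited poms as sub-builds of the main build -->
        <projectsDirectory>${basedir}/hadoop2-poms</projectsDirectory>
        <cloneProjectsTo>${project.build.directory}/hadoop2-build</cloneProjectsTo>
        <goals>
          <goal>install</goal>
        </goals>
      </configuration>
    </execution>
  </executions>
</plugin>

But again, a plain script that rewrites the poms and invokes mvn
directly is probably the less painful of the two.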

I appreciate that this may not be an appealing contribution to where
things are, but it might be the best of the evil choices.


On Tue, May 14, 2013 at 7:50 PM, John Vines <vines@apache.org> wrote:
> The compiled code is compiled code. There are no concerns about dependency
> resolution. So I see no issues with using the profile to define the GAV if
> that is feasible.
>
> Sent from my phone, please pardon the typos and brevity.
> On May 14, 2013 7:47 PM, "Christopher" <ctubbsii@apache.org> wrote:
>
>> Response to Benson inline, but additional note here:
>>
>> It should be noted that the situation will be made worse by the
>> solution I was considering for ACCUMULO-1402, which would move the
>> accumulo artifacts, classified by the hadoop2 variant, into the
>> profiles... meaning they will no longer resolve transitively where they
>> did before. Can go into details on that ticket, if needed.
>>
>> > On Tue, May 14, 2013 at 7:41 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>> > On Tue, May 14, 2013 at 7:36 PM, Christopher <ctubbsii@apache.org> wrote:
>> >> Benson-
>> >>
>> >> They produce different byte-code. That's why we're even considering
>> >> this. ACCUMULO-1402 is the ticket under which our intent is to add
>> >> classifiers, so that they can be distinguished.
>> >
>> > whoops, missed that.
>> >
>> > Then how do people succeed in just fixing up their dependencies and using it?
>>
>> The specific differences are things like changes from an abstract class
>> to an interface. Apparently code compiled against one does not produce
>> byte-code compatible with the other, even though the method signatures
>> look the same.
>>
>> > In any case, speaking as a Maven-maven, classifiers are absolutely,
>> > positively, a cure worse than the disease. If you want the details
>> > just ask.
>>
>> Agreed. I just don't see a good alternative here.
>>
>> >>
>> >> All-
>> >>
>> >> To Keith's point, I think perhaps all this concern is a non-issue...
>> >> because as Keith points out, the dependencies in question are marked
>> >> as "provided", and dependency resolution doesn't occur for provided
>> >> dependencies anyway... so even if we leave off the profiles, we're in
>> >> the same boat. Maybe not the boat we should be in... but certainly not
>> >> a sinking one as I had first imagined. It's as afloat as it was
>> >> before, when they were not in a profile, but still marked as
>> >> "provided".
>> >>
>> >> --
>> >> Christopher L Tubbs II
>> >> http://gravatar.com/ctubbsii
>> >>
>> >>
>> >> On Tue, May 14, 2013 at 7:09 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>> >>> It just doesn't make very much sense to me to have two different GAVs
>> >>> for the very same .class files, just to get different dependencies in
>> >>> the poms. However, if someone really wanted that, I'd look to make
>> >>> some scripting that created this downstream from the main build.
>> >>>
>> >>>
>> >>> On Tue, May 14, 2013 at 6:16 PM, John Vines <vines@apache.org> wrote:
>> >>>> They're the same currently. I was requesting separate gavs for hadoop 2.
>> >>>> It's been on the mailing list and jira.
>> >>>>
>> >>>> Sent from my phone, please pardon the typos and brevity.
>> >>>> On May 14, 2013 6:14 PM, "Keith Turner" <keith@deenlo.com> wrote:
>> >>>>
>> >>>>> On Tue, May 14, 2013 at 5:51 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>> >>>>>
>> >>>>> > I am a maven developer, and I'm offering this advice based on my
>> >>>>> > understanding of the reason why that generic advice is offered.
>> >>>>> >
>> >>>>> > If you have different profiles that _build different results_ but all
>> >>>>> > deliver the same GAV, you have chaos.
>> >>>>> >
>> >>>>>
>> >>>>> What GAV are we currently producing for hadoop 1 and hadoop 2?
>> >>>>>
>> >>>>>
>> >>>>> >
>> >>>>> > If you have different profiles that test against different versions of
>> >>>>> > dependencies, but all deliver the same byte code at the end of the
>> >>>>> > day, you don't have chaos.
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > On Tue, May 14, 2013 at 5:48 PM, Christopher <ctubbsii@apache.org> wrote:
>> >>>>> > > I think it's interesting that Option 4 seems to be most preferred...
>> >>>>> > > because it's the *only* option that is explicitly advised against by
>> >>>>> > > the Maven developers (from the information I've read). I can see its
>> >>>>> > > appeal, but I really don't think that we should introduce an explicit
>> >>>>> > > problem for users (one that applies even to users of the Hadoop version
>> >>>>> > > we directly build against... not just those using Hadoop 2... I don't
>> >>>>> > > know if that point was clear), just to partially support a version of
>> >>>>> > > Hadoop that is still alpha and has never had a stable release.
>> >>>>> > >
>> >>>>> > > BTW, Option 4 was how I had achieved a solution for
>> >>>>> > > ACCUMULO-1402, but I am reluctant to apply that patch with this issue
>> >>>>> > > outstanding, as it may exacerbate the problem.
>> >>>>> > >
>> >>>>> > > Another implication for Option 4 (the current "solution") is for
>> >>>>> > > 1.6.0, with the planned accumulo-maven-plugin... because it means that
>> >>>>> > > the accumulo-maven-plugin will need to be configured like this:
>> >>>>> > > <plugin>
>> >>>>> > >   <groupId>org.apache.accumulo</groupId>
>> >>>>> > >   <artifactId>accumulo-maven-plugin</artifactId>
>> >>>>> > >   <dependencies>
>> >>>>> > >     ... all the required hadoop 1 dependencies to make the plugin work,
>> >>>>> > >     even though this version only works against hadoop 1 anyway...
>> >>>>> > >   </dependencies>
>> >>>>> > >   ...
>> >>>>> > > </plugin>
>> >>>>> > >
>> >>>>> > > --
>> >>>>> > > Christopher L Tubbs II
>> >>>>> > > http://gravatar.com/ctubbsii
>> >>>>> > >
>> >>>>> > >
>> >>>>> > > On Tue, May 14, 2013 at 5:42 PM, Christopher <ctubbsii@apache.org> wrote:
>> >>>>> > >> I think Option 2 is the best solution for "waiting until we have the
>> >>>>> > >> time to solve the problem correctly", as it ensures that transitive
>> >>>>> > >> dependencies work for the stable version of Hadoop, and using Hadoop2
>> >>>>> > >> is a very simple documentation issue for how to apply the patch and
>> >>>>> > >> rebuild. Option 4 doesn't wait... it explicitly introduces a problem
>> >>>>> > >> for users.
>> >>>>> > >>
>> >>>>> > >> Option 1 is how I'm tentatively thinking about fixing it properly in
>> >>>>> > >> 1.6.0.
>> >>>>> > >>
>> >>>>> > >>
>> >>>>> > >> --
>> >>>>> > >> Christopher L Tubbs II
>> >>>>> > >> http://gravatar.com/ctubbsii
>> >>>>> > >>
>> >>>>> > >>
>> >>>>> > >> On Tue, May 14, 2013 at 4:56 PM, John Vines <vines@apache.org> wrote:
>> >>>>> > >>> I'm an advocate of option 4. You say that it's ignoring the problem,
>> >>>>> > >>> whereas I think it's waiting until we have the time to solve the problem
>> >>>>> > >>> correctly. Your reasoning for this is standardizing on maven
>> >>>>> > >>> conventions, but the other options, while more 'correct' from a maven
>> >>>>> > >>> standpoint, are a larger headache for our user base and ourselves. In either
>> >>>>> > >>> case, we're going to be breaking some sort of convention, and while it's
>> >>>>> > >>> not good, we should be doing the one that's less bad for US. The important
>> >>>>> > >>> thing here, now, is that the poms work and we should go with the method
>> >>>>> > >>> that leaves the work minimal for our end users to utilize them.
>> >>>>> > >>>
>> >>>>> > >>> I do agree that 1. is the correct option in the long run. More
>> >>>>> > >>> specifically, I think it boils down to having a single module compatibility
>> >>>>> > >>> layer, which is how hbase deals with this issue. But like you said, we
>> >>>>> > >>> don't have the time to engineer a proper solution. So let sleeping dogs lie
>> >>>>> > >>> and we can revamp the whole system for 1.5.1 or 1.6.0 when we have the
>> >>>>> > >>> cycles to do it right.
>> >>>>> > >>>
>> >>>>> > >>>
>> >>>>> > >>> On Tue, May 14, 2013 at 4:40 PM, Christopher <ctubbsii@apache.org> wrote:
>> >>>>> > >>>
>> >>>>> > >>>> So, I've run into a problem with ACCUMULO-1402 that requires a larger
>> >>>>> > >>>> discussion about how Accumulo 1.5.0 should support Hadoop2.
>> >>>>> > >>>>
>> >>>>> > >>>> The problem is basically that profiles should not contain
>> >>>>> > >>>> dependencies, because profiles don't get activated transitively. A
>> >>>>> > >>>> slide deck by the Maven developers points this out as a bad practice...
>> >>>>> > >>>> yet it's a practice we rely on for our current implementation of
>> >>>>> > >>>> Hadoop2 support
>> >>>>> > >>>> (http://www.slideshare.net/aheritier/geneva-jug-30th-march-2010-maven
>> >>>>> > >>>> slide 80).
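>> >>>>> > >>>>
>> >>>>> > >>>> For illustration, the pattern in question is roughly this (the profile
>> >>>>> > >>>> id and version are placeholders); a dependency declared only inside a
>> >>>>> > >>>> profile is invisible to anyone who merely depends on the artifact:
>> >>>>> > >>>>
>> >>>>> > >>>> <profile>
>> >>>>> > >>>>   <id>hadoop-2.0</id>
>> >>>>> > >>>>   <dependencies>
>> >>>>> > >>>>     <dependency>
>> >>>>> > >>>>       <groupId>org.apache.hadoop</groupId>
>> >>>>> > >>>>       <artifactId>hadoop-client</artifactId>
>> >>>>> > >>>>       <version>2.0.4-alpha</version>
>> >>>>> > >>>>     </dependency>
>> >>>>> > >>>>   </dependencies>
>> >>>>> > >>>> </profile>
>> >>>>> > >>>>
>> >>>>> > >>>> The profile only activates in our own build; it never activates during
>> >>>>> > >>>> a downstream project's dependency resolution.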
>> >>>>> > >>>>
>> >>>>> > >>>> What this means is that even if we go through the work of publishing
>> >>>>> > >>>> binary artifacts compiled against Hadoop2, neither our Hadoop1
>> >>>>> > >>>> binaries nor our Hadoop2 binaries will be able to transitively resolve
>> >>>>> > >>>> any dependencies defined in profiles. This has significant
>> >>>>> > >>>> implications for user code that depends on Accumulo Maven artifacts.
>> >>>>> > >>>> Every user will essentially have to explicitly add Hadoop dependencies
>> >>>>> > >>>> for every Accumulo artifact that depends on Hadoop, whether directly or
>> >>>>> > >>>> transitively (they'll have to peek into the profiles in our POMs and
>> >>>>> > >>>> copy/paste the profile into their project). This becomes more
>> >>>>> > >>>> complicated when we consider how users will try to use things like
>> >>>>> > >>>> Instamo.
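>> >>>>> > >>>>
>> >>>>> > >>>> Concretely, that means every downstream pom would need something like
>> >>>>> > >>>> this added by hand, for whichever Hadoop line they run against (the
>> >>>>> > >>>> artifact and version here are only illustrative):
>> >>>>> > >>>>
>> >>>>> > >>>> <dependency>
>> >>>>> > >>>>   <groupId>org.apache.hadoop</groupId>
>> >>>>> > >>>>   <artifactId>hadoop-client</artifactId>
>> >>>>> > >>>>   <version>1.0.4</version>
>> >>>>> > >>>> </dependency>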
>> >>>>> > >>>>
>> >>>>> > >>>> There are workarounds, but none of them are really pleasant.
>> >>>>> > >>>>
>> >>>>> > >>>> 1. The best way to support both major Hadoop APIs is to have separate
>> >>>>> > >>>> modules with separate dependencies directly in the POM. This is a fair
>> >>>>> > >>>> amount of work, and in my opinion, would be too disruptive for 1.5.0.
>> >>>>> > >>>> This solution also gets us separate binaries for separate supported
>> >>>>> > >>>> versions, which is useful.
>> >>>>> > >>>>
>> >>>>> > >>>> 2. A second option, and the preferred one I think for 1.5.0, is to put
>> >>>>> > >>>> a Hadoop2 patch in the branch's contrib directory
>> >>>>> > >>>> (branches/1.5/contrib) that patches the POM files to support building
>> >>>>> > >>>> against Hadoop2. (Acknowledgement to Keith for suggesting this
>> >>>>> > >>>> solution.)
>> >>>>> > >>>>
>> >>>>> > >>>> 3. A third option is to fork Accumulo, and maintain two separate
>> >>>>> > >>>> builds (a more traditional technique). This adds a merging nightmare for
>> >>>>> > >>>> features/patches, but gets around some reflection hacks that we may
>> >>>>> > >>>> have been motivated to do in the past. I'm not a fan of this option,
>> >>>>> > >>>> particularly because I don't want to replicate the fork nightmare that
>> >>>>> > >>>> has been the history of early Hadoop itself.
>> >>>>> > >>>>
>> >>>>> > >>>> 4. The last option is to do nothing and to continue to build with the
>> >>>>> > >>>> separate profiles as we are, and make users discover and specify
>> >>>>> > >>>> transitive dependencies entirely on their own. I think this is the
>> >>>>> > >>>> worst option, as it essentially amounts to "ignore the problem".
>> >>>>> > >>>>
>> >>>>> > >>>> At the very least, it does not seem reasonable to complete
>> >>>>> > >>>> ACCUMULO-1402 for 1.5.0, given the complexity of this issue.
>> >>>>> > >>>>
>> >>>>> > >>>> Thoughts? Discussion? Vote on option?
>> >>>>> > >>>>
>> >>>>> > >>>> --
>> >>>>> > >>>> Christopher L Tubbs II
>> >>>>> > >>>> http://gravatar.com/ctubbsii
>> >>>>> > >>>>
>> >>>>> >
>> >>>>>
>>
