accumulo-dev mailing list archives

From John Vines <vi...@apache.org>
Subject Re: Hadoop 2 compatibility issues
Date Tue, 14 May 2013 23:50:19 GMT
The compiled code is compiled code. There are no concerns of dependency
resolution. So I see no issues in using the profile to define the GAV if
that is feasible.

Sent from my phone, please pardon the typos and brevity.
On May 14, 2013 7:47 PM, "Christopher" <ctubbsii@apache.org> wrote:

> Response to Benson inline, but additional note here:
>
> It should be noted that the situation will be made worse for the
> solution I was considering for ACCUMULO-1402, which would move the
> accumulo artifacts, classified by the hadoop2 variant, into the
> profiles... meaning they will no longer resolve transitively when they
> did before. Can go into details on that ticket, if needed.
>
> On Tue, May 14, 2013 at 7:41 PM, Benson Margulies <bimargulies@gmail.com>
> wrote:
> > On Tue, May 14, 2013 at 7:36 PM, Christopher <ctubbsii@apache.org> wrote:
> >> Benson-
> >>
> >> They produce different byte-code. That's why we're even considering
> >> this. ACCUMULO-1402 is the ticket under which our intent is to add
> >> classifiers, so that they can be distinguished.
> >
> > whoops, missed that.
> >
> > Then how do people succeed in just fixing up their dependencies and
> > using it?
>
> The specific differences are things like changes from an abstract class
> to an interface. Apparently an import of these does not produce
> compatible byte-code, even though the method signature looks the same.
>
> > In any case, speaking as a Maven-maven, classifiers are absolutely,
> > positively, a cure worse than the disease. If you want the details
> > just ask.
>
> Agreed. I just don't see a good alternative here.
>
> >>
> >> All-
> >>
> >> To Keith's point, I think perhaps all this concern is a non-issue...
> >> because as Keith points out, the dependencies in question are marked
> >> as "provided", and dependency resolution doesn't occur for provided
> >> dependencies anyway... so even if we leave off the profiles, we're in
> >> the same boat. Maybe not the boat we should be in... but certainly not
> >> a sinking one as I had first imagined. It's as afloat as it was
> >> before, when they were not in a profile, but still marked as
> >> "provided".
> >>
> >> --
> >> Christopher L Tubbs II
> >> http://gravatar.com/ctubbsii
> >>
> >>
> >> On Tue, May 14, 2013 at 7:09 PM, Benson Margulies <bimargulies@gmail.com> wrote:
> >>> It just doesn't make very much sense to me to have two different GAVs
> >>> for the very same .class files, just to get different dependencies in
> >>> the poms. However, if someone really wanted that, I'd look to make
> >>> some scripting that created this downstream from the main build.
> >>>
> >>>
> >>> On Tue, May 14, 2013 at 6:16 PM, John Vines <vines@apache.org> wrote:
> >>>> They're the same currently. I was requesting separate gavs for
> >>>> hadoop 2.
> >>>> It's been on the mailing list and jira.
> >>>>
> >>>> Sent from my phone, please pardon the typos and brevity.
> >>>> On May 14, 2013 6:14 PM, "Keith Turner" <keith@deenlo.com> wrote:
> >>>>
> >>>>> On Tue, May 14, 2013 at 5:51 PM, Benson Margulies
> >>>>> <bimargulies@gmail.com> wrote:
> >>>>>
> >>>>> > I am a Maven developer, and I'm offering this advice based on my
> >>>>> > understanding of the reason why that generic advice is offered.
> >>>>> >
> >>>>> > If you have different profiles that _build different results_ but
> >>>>> > all deliver the same GAV, you have chaos.
> >>>>> >
> >>>>>
> >>>>> What GAV are we currently producing for hadoop 1 and hadoop 2?
> >>>>>
> >>>>>
> >>>>> >
> >>>>> > If you have different profiles that test against different
> >>>>> > versions of dependencies, but all deliver the same byte code at
> >>>>> > the end of the day, you don't have chaos.
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tue, May 14, 2013 at 5:48 PM, Christopher <ctubbsii@apache.org>
> >>>>> > wrote:
> >>>>> > > I think it's interesting that Option 4 seems to be most
> >>>>> > > preferred... because it's the *only* option that is explicitly
> >>>>> > > advised against by the Maven developers (from the information
> >>>>> > > I've read). I can see its appeal, but I really don't think that
> >>>>> > > we should introduce an explicit problem for users (that applies
> >>>>> > > to users using even the Hadoop version we directly build
> >>>>> > > against... not just those using Hadoop 2... I don't know if that
> >>>>> > > point was clear), to only partially support a version of Hadoop
> >>>>> > > that is still alpha and has never had a stable release.
> >>>>> > >
> >>>>> > > BTW, Option 4 was how I had achieved a solution for
> >>>>> > > ACCUMULO-1402, but I am reluctant to apply that patch, with this
> >>>>> > > issue outstanding, as it may exacerbate the problem.
> >>>>> > >
> >>>>> > > Another implication for Option 4 (the current "solution") is for
> >>>>> > > 1.6.0, with the planned accumulo-maven-plugin... because it means
> >>>>> > > that the accumulo-maven-plugin will need to be configured like
> >>>>> > > this:
> >>>>> > > <plugin>
> >>>>> > >   <groupId>org.apache.accumulo</groupId>
> >>>>> > >   <artifactId>accumulo-maven-plugin</artifactId>
> >>>>> > >   <dependencies>
> >>>>> > >    ... all the required hadoop 1 dependencies to make the plugin
> >>>>> > >    work, even though this version only works against hadoop 1
> >>>>> > >    anyway...
> >>>>> > >   </dependencies>
> >>>>> > >   ...
> >>>>> > > </plugin>
> >>>>> > >
> >>>>> > > --
> >>>>> > > Christopher L Tubbs II
> >>>>> > > http://gravatar.com/ctubbsii
> >>>>> > >
> >>>>> > >
> >>>>> > > On Tue, May 14, 2013 at 5:42 PM, Christopher <ctubbsii@apache.org>
> >>>>> > > wrote:
> >>>>> > >> I think Option 2 is the best solution for "waiting until we have
> >>>>> > >> the time to solve the problem correctly", as it ensures that
> >>>>> > >> transitive dependencies work for the stable version of Hadoop,
> >>>>> > >> and using Hadoop2 is a very simple documentation issue for how
> >>>>> > >> to apply the patch and rebuild. Option 4 doesn't wait... it
> >>>>> > >> explicitly introduces a problem for users.
> >>>>> > >>
> >>>>> > >> Option 1 is how I'm tentatively thinking about fixing it properly
> >>>>> > >> in 1.6.0.
> >>>>> > >>
> >>>>> > >>
> >>>>> > >> --
> >>>>> > >> Christopher L Tubbs II
> >>>>> > >> http://gravatar.com/ctubbsii
> >>>>> > >>
> >>>>> > >>
> >>>>> > >>> On Tue, May 14, 2013 at 4:56 PM, John Vines <vines@apache.org>
> >>>>> > >>> wrote:
> >>>>> > >>> I'm an advocate of option 4. You say that it's ignoring the
> >>>>> > >>> problem, whereas I think it's waiting until we have the time to
> >>>>> > >>> solve the problem correctly. Your reasoning for this is for
> >>>>> > >>> standardizing for maven conventions, but the other options,
> >>>>> > >>> while more 'correct' from a maven standpoint, are a larger
> >>>>> > >>> headache for our user base and ourselves. In either case, we're
> >>>>> > >>> going to be breaking some sort of convention, and while it's
> >>>>> > >>> not good, we should be doing the one that's less bad for US.
> >>>>> > >>> The important thing here, now, is that the poms work and we
> >>>>> > >>> should go with the method that leaves the work minimal for our
> >>>>> > >>> end users to utilize them.
> >>>>> > >>>
> >>>>> > >>> I do agree that 1. is the correct option in the long run. More
> >>>>> > >>> specifically, I think it boils down to having a single module
> >>>>> > >>> compatibility layer, which is how hbase deals with this issue.
> >>>>> > >>> But like you said, we don't have the time to engineer a proper
> >>>>> > >>> solution. So let sleeping dogs lie and we can revamp the whole
> >>>>> > >>> system for 1.5.1 or 1.6.0 when we have the cycles to do it
> >>>>> > >>> right.
> >>>>> > >>>
> >>>>> > >>>
> >>>>> > >>>> On Tue, May 14, 2013 at 4:40 PM, Christopher
> >>>>> > >>>> <ctubbsii@apache.org> wrote:
> >>>>> > >>>
> >>>>> > >>>> So, I've run into a problem with ACCUMULO-1402 that requires a
> >>>>> > >>>> larger discussion about how Accumulo 1.5.0 should support
> >>>>> > >>>> Hadoop2.
> >>>>> > >>>>
> >>>>> > >>>> The problem is basically that profiles should not contain
> >>>>> > >>>> dependencies, because profiles don't get activated
> >>>>> > >>>> transitively. A slide deck by the Maven developers points this
> >>>>> > >>>> out as a bad practice... yet it's a practice we rely on for
> >>>>> > >>>> our current implementation of Hadoop2 support
> >>>>> > >>>> (http://www.slideshare.net/aheritier/geneva-jug-30th-march-2010-maven
> >>>>> > >>>> slide 80).
> >>>>> > >>>>
> >>>>> > >>>> What this means is that even if we go through the work of
> >>>>> > >>>> publishing binary artifacts compiled against Hadoop2, neither
> >>>>> > >>>> our Hadoop1 binaries nor our Hadoop2 binaries will be able to
> >>>>> > >>>> transitively resolve any dependencies defined in profiles.
> >>>>> > >>>> This has significant implications for user code that depends
> >>>>> > >>>> on Accumulo Maven artifacts. Every user will essentially have
> >>>>> > >>>> to explicitly add Hadoop dependencies for every Accumulo
> >>>>> > >>>> artifact that depends on Hadoop, whether directly or
> >>>>> > >>>> transitively (they'll have to peek into the profiles in our
> >>>>> > >>>> POMs and copy/paste the profile into their project). This
> >>>>> > >>>> becomes more complicated when we consider how users will try
> >>>>> > >>>> to use things like Instamo.
> >>>>> > >>>>
> >>>>> > >>>> There are workarounds, but none of them are really pleasant.
> >>>>> > >>>>
> >>>>> > >>>> 1. The best way to support both major Hadoop APIs is to have
> >>>>> > >>>> separate modules with separate dependencies directly in the
> >>>>> > >>>> POM. This is a fair amount of work, and in my opinion, would
> >>>>> > >>>> be too disruptive for 1.5.0. This solution also gets us
> >>>>> > >>>> separate binaries for separate supported versions, which is
> >>>>> > >>>> useful.
> >>>>> > >>>>
> >>>>> > >>>> 2. A second option, and the preferred one I think for 1.5.0,
> >>>>> > >>>> is to put a Hadoop2 patch in the branch's contrib directory
> >>>>> > >>>> (branches/1.5/contrib) that patches the POM files to support
> >>>>> > >>>> building against Hadoop2. (Acknowledgement to Keith for
> >>>>> > >>>> suggesting this solution.)
> >>>>> > >>>>
> >>>>> > >>>> 3. A third option is to fork Accumulo, and maintain two
> >>>>> > >>>> separate builds (a more traditional technique). This adds a
> >>>>> > >>>> merging nightmare for features/patches, but gets around some
> >>>>> > >>>> reflection hacks that we may have been motivated to do in the
> >>>>> > >>>> past. I'm not a fan of this option, particularly because I
> >>>>> > >>>> don't want to replicate the fork nightmare that has been the
> >>>>> > >>>> history of early Hadoop itself.
> >>>>> > >>>>
> >>>>> > >>>> 4. The last option is to do nothing and to continue to build
> >>>>> > >>>> with the separate profiles as we are, and make users discover
> >>>>> > >>>> and specify transitive dependencies entirely on their own. I
> >>>>> > >>>> think this is the worst option, as it essentially amounts to
> >>>>> > >>>> "ignore the problem".
> >>>>> > >>>>
> >>>>> > >>>> At the very least, it does not seem reasonable to complete
> >>>>> > >>>> ACCUMULO-1402 for 1.5.0, given the complexity of this issue.
> >>>>> > >>>>
> >>>>> > >>>> Thoughts? Discussion? Vote on option?
> >>>>> > >>>>
> >>>>> > >>>> --
> >>>>> > >>>> Christopher L Tubbs II
> >>>>> > >>>> http://gravatar.com/ctubbsii
> >>>>> > >>>>
> >>>>> >
> >>>>>
>

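For readers following the thread, the pattern described in the original
post can be sketched as a POM fragment. This is a minimal sketch, not the
actual Accumulo 1.5.0 POM: the profile ids, the hadoop.profile property,
and the Hadoop version numbers are assumptions chosen for illustration.
It shows Hadoop dependencies declared only inside profiles, which is the
arrangement the post says will not resolve transitively for downstream
projects.

```xml
<!-- Library-side POM fragment. Profile ids, the activation property,
     and the versions below are hypothetical. -->
<profiles>
  <profile>
    <id>hadoop-1</id>
    <activation>
      <!-- Active when the hadoop.profile property is NOT set -->
      <property>
        <name>!hadoop.profile</name>
      </property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.0.4</version>
        <scope>provided</scope>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>hadoop-2</id>
    <activation>
      <!-- Active when building with -Dhadoop.profile=2 -->
      <property>
        <name>hadoop.profile</name>
        <value>2</value>
      </property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.0.4-alpha</version>
        <scope>provided</scope>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

Under Option 4, a downstream project would then declare the Hadoop
dependency itself, roughly along these lines (again with assumed
versions):

```xml
<!-- Consumer-side POM fragment (versions are illustrative) -->
<dependencies>
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>1.5.0</version>
  </dependency>
  <!-- Declared explicitly by the user because, per the thread, it is
       not picked up transitively from the Accumulo POM -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.0.4</version>
  </dependency>
</dependencies>
```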