accumulo-dev mailing list archives

From John Vines <vi...@apache.org>
Subject Re: Hadoop 2 compatibility issues
Date Tue, 14 May 2013 22:16:48 GMT
They're the same currently. I was requesting separate gavs for hadoop 2.
It's been on the mailing list and jira.

Sent from my phone, please pardon the typos and brevity.
On May 14, 2013 6:14 PM, "Keith Turner" <keith@deenlo.com> wrote:

> On Tue, May 14, 2013 at 5:51 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>
> > I am a maven developer, and I'm offering this advice based on my
> > understanding of the reason why that generic advice is offered.
> >
> > If you have different profiles that _build different results_ but all
> > deliver the same GAV, you have chaos.
> >
>
> What GAV are we currently producing for hadoop 1 and hadoop 2?
>
>
> >
> > If you have different profiles that test against different versions of
> > dependencies, but all deliver the same byte code at the end of the
> > day, you don't have chaos.
> >
> >
> >
> > On Tue, May 14, 2013 at 5:48 PM, Christopher <ctubbsii@apache.org> wrote:
> > > I think it's interesting that Option 4 seems to be most preferred...
> > > because it's the *only* option that is explicitly advised against by
> > > the Maven developers (from the information I've read). I can see its
> > > appeal, but I really don't think that we should introduce an explicit
> > > problem for users (that applies to users using even the Hadoop version
> > > we directly build against... not just those using Hadoop 2... I don't
> > > know if that point was clear), to only partially support a version of
> > > Hadoop that is still alpha and has never had a stable release.
> > >
> > > BTW, Option 4 was how I had achieved a solution for
> > > ACCUMULO-1402, but I am reluctant to apply that patch with this issue
> > > outstanding, as it may exacerbate the problem.
> > >
> > > Another implication for Option 4 (the current "solution") is for
> > > 1.6.0, with the planned accumulo-maven-plugin... because it means that
> > > the accumulo-maven-plugin will need to be configured like this:
> > > <plugin>
> > >   <groupId>org.apache.accumulo</groupId>
> > >   <artifactId>accumulo-maven-plugin</artifactId>
> > >   <dependencies>
> > >    ... all the required hadoop 1 dependencies to make the plugin work,
> > > even though this version only works against hadoop 1 anyway...
> > >   </dependencies>
> > >   ...
> > > </plugin>
> > >
> > > --
> > > Christopher L Tubbs II
> > > http://gravatar.com/ctubbsii
> > >
> > >
> > > On Tue, May 14, 2013 at 5:42 PM, Christopher <ctubbsii@apache.org> wrote:
> > >> I think Option 2 is the best solution for "waiting until we have the
> > >> time to solve the problem correctly", as it ensures that transitive
> > >> dependencies work for the stable version of Hadoop, and using Hadoop2
> > >> is a very simple documentation issue for how to apply the patch and
> > >> rebuild. Option 4 doesn't wait... it explicitly introduces a problem
> > >> for users.
> > >>
> > >> Option 1 is how I'm tentatively thinking about fixing it properly in
> > >> 1.6.0.
> > >>
> > >>
> > >> --
> > >> Christopher L Tubbs II
> > >> http://gravatar.com/ctubbsii
> > >>
> > >>
> > >> On Tue, May 14, 2013 at 4:56 PM, John Vines <vines@apache.org> wrote:
> > >>> I'm an advocate of option 4. You say that it's ignoring the problem,
> > >>> whereas I think it's waiting until we have the time to solve the
> > >>> problem correctly. Your reasoning for this is standardizing on maven
> > >>> conventions, but the other options, while more 'correct' from a maven
> > >>> standpoint, are a larger headache for our user base and ourselves. In
> > >>> either case, we're going to be breaking some sort of convention, and
> > >>> while it's not good, we should be doing the one that's less bad for US.
> > >>> The important thing here, now, is that the poms work and we should go
> > >>> with the method that leaves the work minimal for our end users to
> > >>> utilize them.
> > >>>
> > >>> I do agree that 1. is the correct option in the long run. More
> > >>> specifically, I think it boils down to having a single module
> > >>> compatibility layer, which is how hbase deals with this issue. But like
> > >>> you said, we don't have the time to engineer a proper solution. So let
> > >>> sleeping dogs lie and we can revamp the whole system for 1.5.1 or 1.6.0
> > >>> when we have the cycles to do it right.
> > >>>
> > >>>
> > >>> On Tue, May 14, 2013 at 4:40 PM, Christopher <ctubbsii@apache.org> wrote:
> > >>>
> > >>>> So, I've run into a problem with ACCUMULO-1402 that requires a larger
> > >>>> discussion about how Accumulo 1.5.0 should support Hadoop2.
> > >>>>
> > >>>> The problem is basically that profiles should not contain
> > >>>> dependencies, because profiles don't get activated transitively. A
> > >>>> slide deck by the Maven developers points this out as a bad practice...
> > >>>> yet it's a practice we rely on for our current implementation of
> > >>>> Hadoop2 support
> > >>>> (http://www.slideshare.net/aheritier/geneva-jug-30th-march-2010-maven
> > >>>> slide 80).
> > >>>>
> > >>>> What this means is that even if we go through the work of publishing
> > >>>> binary artifacts compiled against Hadoop2, neither our Hadoop1
> > >>>> binaries nor our Hadoop2 binaries will be able to transitively resolve
> > >>>> any dependencies defined in profiles. This has significant
> > >>>> implications for user code that depends on Accumulo Maven artifacts.
> > >>>> Every user will essentially have to explicitly add Hadoop dependencies
> > >>>> for every Accumulo artifact that has dependencies on Hadoop, either
> > >>>> because we directly or transitively depend on Hadoop (they'll have to
> > >>>> peek into the profiles in our POMs and copy/paste the profile into
> > >>>> their project). This becomes more complicated when we consider how
> > >>>> users will try to use things like Instamo.
> > >>>>
> > >>>> There are workarounds, but none of them are really pleasant.
> > >>>>
> > >>>> 1. The best way to support both major Hadoop APIs is to have separate
> > >>>> modules with separate dependencies directly in the POM. This is a fair
> > >>>> amount of work, and in my opinion, would be too disruptive for 1.5.0.
> > >>>> This solution also gets us separate binaries for separate supported
> > >>>> versions, which is useful.
> > >>>>
> > >>>> 2. A second option, and the preferred one I think for 1.5.0, is to put
> > >>>> a Hadoop2 patch in the branch's contrib directory
> > >>>> (branches/1.5/contrib) that patches the POM files to support building
> > >>>> against Hadoop2. (Acknowledgement to Keith for suggesting this
> > >>>> solution.)
> > >>>>
> > >>>> 3. A third option is to fork Accumulo, and maintain two separate
> > >>>> builds (a more traditional technique). This adds a merging nightmare
> > >>>> for features/patches, but gets around some reflection hacks that we may
> > >>>> have been motivated to do in the past. I'm not a fan of this option,
> > >>>> particularly because I don't want to replicate the fork nightmare that
> > >>>> has been the history of early Hadoop itself.
> > >>>>
> > >>>> 4. The last option is to do nothing and to continue to build with the
> > >>>> separate profiles as we are, and make users discover and specify
> > >>>> transitive dependencies entirely on their own. I think this is the
> > >>>> worst option, as it essentially amounts to "ignore the problem".
> > >>>>
> > >>>> At the very least, it does not seem reasonable to complete
> > >>>> ACCUMULO-1402 for 1.5.0, given the complexity of this issue.
> > >>>>
> > >>>> Thoughts? Discussion? Vote on option?
> > >>>>
> > >>>> --
> > >>>> Christopher L Tubbs II
> > >>>> http://gravatar.com/ctubbsii
> > >>>>
> >
>
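
For readers unfamiliar with the profile problem described in the original
message of this thread, here is a rough sketch of it. This is an illustration
only, not taken from the Accumulo POMs: the profile id and the version numbers
are assumptions chosen for the example. The Maven elements and the
org.apache.hadoop:hadoop-core and org.apache.accumulo:accumulo-core
coordinates are real.

```xml
<!-- Fragment 1: library POM (sketch). The Hadoop dependency is declared
     only inside a profile, so it is not part of the main dependency list
     that the published POM exposes to consumers. -->
<profiles>
  <profile>
    <id>hadoop-1</id> <!-- hypothetical profile id -->
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.0.4</version> <!-- illustrative version -->
      </dependency>
    </dependencies>
  </profile>
</profiles>

<!-- Fragment 2: downstream user POM (sketch). As the thread describes it,
     the profile above is not activated transitively, so the user has to
     copy the Hadoop dependency into their own project by hand. -->
<dependencies>
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>1.5.0</version> <!-- illustrative version -->
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.0.4</version> <!-- must be kept in sync manually -->
  </dependency>
</dependencies>
```

Under Option 1 from the thread, the Hadoop dependency would presumably move
out of the profile and into the main dependencies section of a
Hadoop-version-specific module, so the second dependency in Fragment 2 would
no longer need to be declared by hand.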
