accumulo-dev mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: Hadoop 2 compatibility issues
Date Tue, 14 May 2013 23:47:08 GMT
Responses to Benson are inline, but one additional note here:

It should be noted that the situation will be made worse by the
solution I was considering for ACCUMULO-1402, which would move the
Accumulo artifacts, classified by the hadoop2 variant, into the
profiles... meaning they will no longer resolve transitively where they
did before. I can go into details on that ticket, if needed.
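
For illustration, the change I was considering has roughly this shape (a
sketch only; the artifact, version, and profile id here are illustrative,
not the actual patch):

  <profile>
    <id>hadoop2</id>
    <dependencies>
      <!-- the hadoop2-classified variant of our own artifact; because it
           sits in a profile, it is only visible when the profile is active
           in the resolving build, and profiles are never activated
           transitively -->
      <dependency>
        <groupId>org.apache.accumulo</groupId>
        <artifactId>accumulo-core</artifactId>
        <version>1.5.0</version>
        <classifier>hadoop2</classifier>
      </dependency>
    </dependencies>
  </profile>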

On Tue, May 14, 2013 at 7:41 PM, Benson Margulies <bimargulies@gmail.com> wrote:
> On Tue, May 14, 2013 at 7:36 PM, Christopher <ctubbsii@apache.org> wrote:
>> Benson-
>>
>> They produce different byte-code. That's why we're even considering
>> this. ACCUMULO-1402 is the ticket under which our intent is to add
>> classifiers, so that they can be distinguished.
>
> whoops, missed that.
>
> Then how do people succeed in just fixing up their dependencies and using it?

The specific differences are things like a change from an abstract class
to an interface. Apparently code compiled against one form does not
produce byte-code compatible with the other, even though the source and
method signatures look the same.

> In any case, speaking as a Maven-maven, classifiers are absolutely,
> positively, a cure worse than the disease. If you want the details
> just ask.

Agreed. I just don't see a good alternative here.
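
For the archives: as I understand the Maven docs, the root of the problem
with classifiers is that a classified artifact shares the main artifact's
POM, so a user could select a hadoop2-built jar but would still inherit the
hadoop1 dependency set. Roughly (version illustrative):

  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>1.5.0</version>
    <!-- selects the hadoop2-built jar, but the transitive dependencies
         still come from the single POM shared by all classifiers -->
    <classifier>hadoop2</classifier>
  </dependency>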

>>
>> All-
>>
>> To Keith's point, I think perhaps all this concern is a non-issue...
>> because, as Keith points out, the dependencies in question are marked
>> as "provided", and transitive dependency resolution doesn't occur for
>> provided dependencies anyway... so even if we leave off the profiles,
>> we're in the same boat. Maybe not the boat we should be in... but
>> certainly not a sinking one as I had first imagined. It's as afloat as
>> it was before, when they were not in a profile, but still marked as
>> "provided".
>>
>> --
>> Christopher L Tubbs II
>> http://gravatar.com/ctubbsii
>>
>>
>> On Tue, May 14, 2013 at 7:09 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>>> It just doesn't make very much sense to me to have two different GAVs
>>> for the very same .class files, just to get different dependencies in
>>> the poms. However, if someone really wanted that, I'd look to make
>>> some scripting that created this downstream from the main build.
>>>
>>>
>>> On Tue, May 14, 2013 at 6:16 PM, John Vines <vines@apache.org> wrote:
>>>> They're the same currently. I was requesting separate GAVs for Hadoop 2.
>>>> It's been on the mailing list and jira.
>>>>
>>>> Sent from my phone, please pardon the typos and brevity.
>>>> On May 14, 2013 6:14 PM, "Keith Turner" <keith@deenlo.com> wrote:
>>>>
>>>>> On Tue, May 14, 2013 at 5:51 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>>>>>
>>>>> > I am a Maven developer, and I'm offering this advice based on my
>>>>> > understanding of the reasons why that generic advice is offered.
>>>>> >
>>>>> > If you have different profiles that _build different results_ but
>>>>> > all deliver the same GAV, you have chaos.
>>>>> >
>>>>>
>>>>> What GAV are we currently producing for hadoop 1 and hadoop 2?
>>>>>
>>>>>
>>>>> >
>>>>> > If you have different profiles that test against different versions
>>>>> > of dependencies, but all deliver the same byte code at the end of
>>>>> > the day, you don't have chaos.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, May 14, 2013 at 5:48 PM, Christopher <ctubbsii@apache.org> wrote:
>>>>> > > I think it's interesting that Option 4 seems to be most
>>>>> > > preferred... because it's the *only* option that is explicitly
>>>>> > > advised against by the Maven developers (from the information I've
>>>>> > > read). I can see its appeal, but I really don't think we should
>>>>> > > introduce an explicit problem for users (one that applies even to
>>>>> > > users on the Hadoop version we directly build against... not just
>>>>> > > those using Hadoop 2... I don't know if that point was clear), just
>>>>> > > to partially support a version of Hadoop that is still alpha and
>>>>> > > has never had a stable release.
>>>>> > >
>>>>> > > BTW, Option 4 was how I had achieved a solution for
>>>>> > > ACCUMULO-1402, but I am reluctant to apply that patch with this
>>>>> > > issue outstanding, as it may exacerbate the problem.
>>>>> > >
>>>>> > > Another implication of Option 4 (the current "solution") is for
>>>>> > > 1.6.0, with the planned accumulo-maven-plugin... because it means
>>>>> > > that the accumulo-maven-plugin will need to be configured like this:
>>>>> > > <plugin>
>>>>> > >   <groupId>org.apache.accumulo</groupId>
>>>>> > >   <artifactId>accumulo-maven-plugin</artifactId>
>>>>> > >   <dependencies>
>>>>> > >     ... all the required hadoop 1 dependencies to make the plugin
>>>>> > >     work, even though this version only works against hadoop 1 anyway...
>>>>> > >   </dependencies>
>>>>> > >   ...
>>>>> > > </plugin>
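
(Concretely, that <dependencies> block would hold something like the
following; artifact and version are purely illustrative:)

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <!-- some hadoop 1 line version, whichever the plugin builds against -->
      <version>1.0.4</version>
    </dependency>
  </dependencies>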
>>>>> > >
>>>>> > > --
>>>>> > > Christopher L Tubbs II
>>>>> > > http://gravatar.com/ctubbsii
>>>>> > >
>>>>> > >
>>>>> > > On Tue, May 14, 2013 at 5:42 PM, Christopher <ctubbsii@apache.org> wrote:
>>>>> > >> I think Option 2 is the best solution for "waiting until we have
>>>>> > >> the time to solve the problem correctly", as it ensures that
>>>>> > >> transitive dependencies work for the stable version of Hadoop, and
>>>>> > >> using Hadoop2 becomes a simple documentation matter of how to
>>>>> > >> apply the patch and rebuild. Option 4 doesn't wait... it
>>>>> > >> explicitly introduces a problem for users.
>>>>> > >>
>>>>> > >> Option 1 is how I'm tentatively thinking about fixing it properly
>>>>> > >> in 1.6.0.
>>>>> > >>
>>>>> > >>
>>>>> > >> --
>>>>> > >> Christopher L Tubbs II
>>>>> > >> http://gravatar.com/ctubbsii
>>>>> > >>
>>>>> > >>
>>>>> > >> On Tue, May 14, 2013 at 4:56 PM, John Vines <vines@apache.org> wrote:
>>>>> > >>> I'm an advocate of option 4. You say that it's ignoring the
>>>>> > >>> problem, whereas I think it's waiting until we have the time to
>>>>> > >>> solve the problem correctly. Your reasoning for this is
>>>>> > >>> standardizing on Maven conventions, but the other options, while
>>>>> > >>> more 'correct' from a Maven standpoint, are a larger headache for
>>>>> > >>> our user base and ourselves. In either case, we're going to be
>>>>> > >>> breaking some sort of convention, and while that's not good, we
>>>>> > >>> should be doing the one that's less bad for US. The important
>>>>> > >>> thing here, now, is that the poms work, and we should go with the
>>>>> > >>> method that leaves minimal work for our end users to utilize them.
>>>>> > >>>
>>>>> > >>> I do agree that 1. is the correct option in the long run. More
>>>>> > >>> specifically, I think it boils down to having a single-module
>>>>> > >>> compatibility layer, which is how hbase deals with this issue.
>>>>> > >>> But like you said, we don't have the time to engineer a proper
>>>>> > >>> solution. So let sleeping dogs lie, and we can revamp the whole
>>>>> > >>> system for 1.5.1 or 1.6.0 when we have the cycles to do it right.
>>>>> > >>>
>>>>> > >>>
>>>>> > >>> On Tue, May 14, 2013 at 4:40 PM, Christopher <ctubbsii@apache.org> wrote:
>>>>> > >>>
>>>>> > >>>> So, I've run into a problem with ACCUMULO-1402 that requires a
>>>>> > >>>> larger discussion about how Accumulo 1.5.0 should support Hadoop2.
>>>>> > >>>>
>>>>> > >>>> The problem is basically that profiles should not contain
>>>>> > >>>> dependencies, because profiles don't get activated transitively.
>>>>> > >>>> A slide deck by the Maven developers points this out as a bad
>>>>> > >>>> practice... yet it's a practice we rely on for our current
>>>>> > >>>> implementation of Hadoop2 support
>>>>> > >>>> (http://www.slideshare.net/aheritier/geneva-jug-30th-march-2010-maven
>>>>> > >>>> slide 80).
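
(For anyone who hasn't looked at our poms: the pattern the slide deck warns
about is essentially the following; the property name and versions are
illustrative of what we do, not copied from the build:)

  <profile>
    <id>hadoop-2.0</id>
    <activation>
      <property>
        <!-- only activated when the building project passes this flag; a
             downstream consumer resolving our pom never activates it -->
        <name>hadoop.profile</name>
        <value>2.0</value>
      </property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.0.4-alpha</version>
      </dependency>
    </dependencies>
  </profile>
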
>>>>> > >>>>
>>>>> > >>>> What this means is that even if we go through the work of
>>>>> > >>>> publishing binary artifacts compiled against Hadoop2, neither
>>>>> > >>>> our Hadoop1 binaries nor our Hadoop2 binaries will be able to
>>>>> > >>>> transitively resolve any dependencies defined in profiles. This
>>>>> > >>>> has significant implications for user code that depends on
>>>>> > >>>> Accumulo Maven artifacts. Every user will essentially have to
>>>>> > >>>> explicitly add Hadoop dependencies for every Accumulo artifact
>>>>> > >>>> that depends on Hadoop, whether directly or transitively
>>>>> > >>>> (they'll have to peek into the profiles in our POMs and
>>>>> > >>>> copy/paste the profile into their project). This becomes more
>>>>> > >>>> complicated when we consider how users will try to use things
>>>>> > >>>> like Instamo.
>>>>> > >>>>
>>>>> > >>>> There are workarounds, but none of them are really pleasant.
>>>>> > >>>>
>>>>> > >>>> 1. The best way to support both major Hadoop APIs is to have
>>>>> > >>>> separate modules with separate dependencies directly in the
>>>>> > >>>> POM. This is a fair amount of work and, in my opinion, would be
>>>>> > >>>> too disruptive for 1.5.0. This solution also gets us separate
>>>>> > >>>> binaries for the separate supported versions, which is useful.
>>>>> > >>>> (A sketch of what I mean follows this list.)
>>>>> > >>>>
>>>>> > >>>> 2. A second option, and the preferred one I think for 1.5.0, is
>>>>> > >>>> to put a Hadoop2 patch in the branch's contrib directory
>>>>> > >>>> (branches/1.5/contrib) that patches the POM files to support
>>>>> > >>>> building against Hadoop2. (Acknowledgement to Keith for
>>>>> > >>>> suggesting this solution.)
>>>>> > >>>>
>>>>> > >>>> 3. A third option is to fork Accumulo and maintain two separate
>>>>> > >>>> builds (a more traditional technique). This adds a merging
>>>>> > >>>> nightmare for features/patches, but gets around some reflection
>>>>> > >>>> hacks that we may have been motivated to do in the past. I'm
>>>>> > >>>> not a fan of this option, particularly because I don't want to
>>>>> > >>>> replicate the fork nightmare that has been the history of early
>>>>> > >>>> Hadoop itself.
>>>>> > >>>>
>>>>> > >>>> 4. The last option is to do nothing and to continue to build
>>>>> > >>>> with the separate profiles as we are, and make users discover
>>>>> > >>>> and specify transitive dependencies entirely on their own. I
>>>>> > >>>> think this is the worst option, as it essentially amounts to
>>>>> > >>>> "ignore the problem".
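
(For option 1, I imagine a layout along the lines of hbase's compat
modules, as John mentions above; module names are purely hypothetical:)

  <!-- in the parent pom: one compatibility module per Hadoop line, each
       declaring its own Hadoop dependencies directly, so ordinary
       transitive resolution works again -->
  <modules>
    <module>accumulo-hadoop1-compat</module>
    <module>accumulo-hadoop2-compat</module>
  </modules>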
>>>>> > >>>>
>>>>> > >>>> At the very least, it does not seem reasonable to complete
>>>>> > >>>> ACCUMULO-1402 for 1.5.0, given the complexity of this issue.
>>>>> > >>>>
>>>>> > >>>> Thoughts? Discussion? Vote on an option?
>>>>> > >>>>
>>>>> > >>>> --
>>>>> > >>>> Christopher L Tubbs II
>>>>> > >>>> http://gravatar.com/ctubbsii
>>>>> > >>>>
>>>>> >
>>>>>
