accumulo-dev mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: [DISCUSS] packaging our dependencies
Date Sun, 11 May 2014 22:07:28 GMT
In general, I think this is reasonable, especially because Hadoop
Client stabilizes things a bit. On the other hand, supporting both
Hadoop 1 and Hadoop 2 complicates the dependencies declared in the pom
(somewhat) and the dependencies we package (more so). I know some of us
want to drop Hadoop 1 support in 2.0.0, and I think this is one more
good reason to do that.

Another data point that I think is going to complicate things a (very)
tiny bit: the work on ACCUMULO-2589 includes things like dropping the
dependencies on Hadoop from the API. But we're likely to still have a
dependency on Guava (there was a suggestion to use Guava's @Beta
annotations in the API). Maybe this is fine, because the packaging
considerations for the binary tarball are not the same as the API
module dependencies (though they'll have to be compatible), but it's
something to consider.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Sun, May 11, 2014 at 4:45 PM, Sean Busbey <busbey@cloudera.com> wrote:
> ACCUMULO-2786 has brought up the issue of what dependencies we bring with
> Accumulo rather than depend on the environment providing[1].
>
> Christopher explains our extant reasoning thus
>
>> The precedent has been: if vanilla Apache Hadoop provides it in its bin
>> tarball, we don't need to.
>
> I'd like us to move to packaging any dependencies that aren't brought in by
> Hadoop Client.
>
> 1) Our existing practice developed before Hadoop Client existed, so we
> essentially *had* to have all of the Hadoop related deps on our classpath.
> For versions where we default to Hadoop 2, we can improve things.
>
> 2) We should encourage users to follow good practice by minimizing the
> number of jars added to the classpath.
>
> 3) We still have to include the jars found in Hadoop Client because we use
> Hadoop.
>
> 4) Limiting the dependencies we rely on external sources to provide allows
> us to update more of our dependencies to current versions.
>
> 5) Minimizing the number of jars we rely on from external sources reduces
> the chances that they change out from under us (and thus reduces the number
> of external factors we have to remain cognizant of).
>
> 6) Minimizing the classpath reduces the chances of having multiple
> different versions of the same library present.
>
> I'd also like for us to *not* package any of the jars brought in by Hadoop
> Client. Due to the additional work it would take to downgrade our version
> of Guava, I'd like to wait to do that.
>
> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2786
>
> --
> Sean
