accumulo-dev mailing list archives

From "Joey Echeverria" <>
Subject Re: [DISCUSS] packaging our dependencies
Date Tue, 13 May 2014 00:36:18 GMT
It means packaging the other jars that had been made available at runtime by virtue of
their existence in the Hadoop directories.

I'm only talking about dependencies that were/are provided by Hadoop.

But since you brought up ZooKeeper, my understanding is that ZK intends for dependent projects
to rely only on the ZK jar that is in the top level of the tarball. If you need other jars,
you should package them yourself. WARNING: my info about ZK may be out of date, as it's been
a long time since I spoke to the project about how they intend ZK to be consumed by services
that rely on it.
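To make that concrete, relying only on the top-level ZK jar while packaging anything else ourselves could look roughly like this in a Maven pom. This is a sketch, not a vetted configuration: the version number and the particular exclusion shown are illustrative.

```xml
<!-- Sketch: depend on the ZooKeeper jar alone and keep transitive
     dependencies off our classpath, so we package only what we
     deliberately choose to ship ourselves. -->
<dependency>
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <version>3.4.6</version><!-- illustrative version -->
  <exclusions>
    <exclusion>
      <!-- example: keep ZK's log4j binding out of our tarball -->
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```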

On Mon, May 12, 2014 at 7:30 PM, Christopher <> wrote:

> Does that mean package everything else?
> What about ZooKeeper?
> --
> Christopher L Tubbs II
> On Mon, May 12, 2014 at 3:38 PM, Joey Echeverria <> wrote:
>> +1 to only depending on Hadoop client jars.
>> --
>> Joey Echeverria
>> Chief Architect
>> Cloudera Government Solutions
>> On Sun, May 11, 2014 at 6:07 PM, Christopher <> wrote:
>>> In general, I think this is reasonable... especially because Hadoop
>>> Client stabilizes things a bit. On the other hand, dependencies in the
>>> pom get somewhat complicated, and packaged dependencies get more
>>> complicated, when we're talking about supporting both Hadoop 1 and
>>> Hadoop 2. I know some of us want to drop Hadoop 1 support in 2.0.0,
>>> and I think this is one more good reason to do that.
>>> Another data point that I think is going to complicate things a (very)
>>> tiny bit: the work on ACCUMULO-2589 includes dropping the
>>> dependencies on Hadoop from the API. But we're likely to still have a
>>> dependency on Guava (there was a suggestion to use Guava's @Beta
>>> annotations in the API). Maybe this is fine, because the packaging
>>> considerations for the binary tarball are not the same as the API
>>> module dependencies (though they'll have to be compatible), but it's
>>> something to consider.
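To illustrate the @Beta idea, here is a minimal sketch. The annotation below is a locally declared stand-in for Guava's com.google.common.annotations.Beta, declared here only so the example compiles without the Guava jar (the real annotation uses class-file retention, not runtime), and the class names are hypothetical:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for Guava's @Beta so this sketch needs no Guava jar.
// Note: the real Guava annotation uses class-file retention; RUNTIME is
// used here only so the demo below can inspect it reflectively.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.CONSTRUCTOR, ElementType.FIELD})
@interface Beta {}

// Hypothetical API class: @Beta marks one method as subject to
// incompatible change without destabilizing the rest of the API.
class TableOperationsSketch {
    @Beta
    public String experimentalOp() {
        return "unstable";
    }

    public String stableOp() {
        return "stable";
    }
}

public class BetaDemo {
    public static void main(String[] args) throws Exception {
        // Tooling (or a reviewer) can discover which methods are marked unstable.
        boolean marked = TableOperationsSketch.class
                .getMethod("experimentalOp")
                .isAnnotationPresent(Beta.class);
        System.out.println("experimentalOp marked @Beta: " + marked);
    }
}
```

The point of the annotation approach is that only the annotation type needs to be on the API's compile-time classpath; consumers who never touch @Beta members are unaffected.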
>>> --
>>> Christopher L Tubbs II
>>> On Sun, May 11, 2014 at 4:45 PM, Sean Busbey <> wrote:
>>>> ACCUMULO-2786 has brought up the issue of which dependencies we bring with
>>>> Accumulo rather than depend on the environment to provide[1].
>>>> Christopher explains our extant reasoning thus:
>>>>> The precedent has been: if vanilla Apache Hadoop provides it in its bin
>>>>> tarball, we don't need to.
>>>> I'd like us to move to packaging any dependencies that aren't brought in by
>>>> Hadoop Client.
>>>> 1) Our existing practice developed before Hadoop Client existed, so we
>>>> essentially *had* to have all of the Hadoop related deps on our classpath.
>>>> For versions where we default to Hadoop 2, we can improve things.
>>>> 2) We should encourage users to follow good practice by minimizing the
>>>> number of jars added to the classpath.
>>>> 3) We still have to include the jars found in Hadoop Client because we use
>>>> Hadoop.
>>>> 4) Limiting the dependencies we rely on external sources to provide allows
>>>> us to update more of our dependencies to current versions.
>>>> 5) Minimizing the number of jars we rely on from external sources reduces
>>>> the chances that they change out from under us (and thus reduces the number
>>>> of external factors we have to remain cognizant of)
>>>> 6) Minimizing the classpath reduces the chances of having multiple
>>>> different versions of the same library present.
>>>> I'd also like for us to *not* package any of the jars brought in by Hadoop
>>>> Client. Due to the additional work it would take to downgrade our version
>>>> of guava, I'd like to wait to do that.
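As a sketch of the mechanics, not packaging the jars brought in by Hadoop Client usually means marking hadoop-client as a provided-scope dependency, so we compile against it but leave its jars to the runtime environment (the Hadoop installation). The version number here is illustrative:

```xml
<!-- Sketch: compile against hadoop-client, but rely on the deployed
     Hadoop installation to provide its jars at runtime, so they are
     not re-packaged in the Accumulo binary tarball. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.2.0</version><!-- illustrative version -->
  <scope>provided</scope>
</dependency>
```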
>>>> [1]:
>>>> --
>>>> Sean