accumulo-dev mailing list archives

From Michael Berman <>
Subject Re: "Provided" dependencies
Date Wed, 06 Nov 2013 22:43:09 GMT
I think it would be nice to separate what client API users need from the
provided dependencies issue.  It seems like whatever module client
projects depend on should itself only have dependencies on things that it
actually needs.  If it doesn't need hadoop, then it shouldn't declare it as
a dependency at all.  The hadoop-dependent server and the
hadoop-independent client interface both need to share intermediate
objects, but it seems like those could be defined in another, common
hadoop-independent module.

The In/OutputFormats are an exception, but I agree they would be best separated
into their own hadoop-dependent module (which might itself depend on the
client module).
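
For illustration, here is a rough sketch of what such a module's pom could
declare. The artifactIds accumulo-mapreduce and accumulo-client are
hypothetical names, not existing modules, and the version properties are
placeholders:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.accumulo</groupId>
  <!-- Hypothetical module holding only the In/OutputFormats -->
  <artifactId>accumulo-mapreduce</artifactId>
  <version>${accumulo.version}</version>

  <dependencies>
    <!-- Hypothetical hadoop-independent client module -->
    <dependency>
      <groupId>org.apache.accumulo</groupId>
      <artifactId>accumulo-client</artifactId>
      <version>${accumulo.version}</version>
    </dependency>
    <!-- Hadoop is declared only here, where it is actually needed -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
</project>
```

With a layout like this, a project that only uses the client API would
never see hadoop on its classpath unless it also depends on the
In/OutputFormat module.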

As far as the provided question goes, it seems to me that the only reason
to mark a dep provided is if we think developers will *usually* want to
compile against different versions.  Initially I thought it would make
sense if we thought the runtime versions would vary, but Chris makes a good
point that the deps we include in the distributed package can be selected
independently of the Maven dep scope.  Since you can build Accumulo against
any version of Hadoop and it will still run against any other version of
Hadoop, I think it's better to make things easier on us by keeping the
Hadoop dependency compile scoped.
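
To make the difference concrete, the two options come down to one element
in the dependency declaration. This is only a sketch: the hadoop-client
coordinates and the ${hadoop.version} property are assumptions for
illustration, not a copy of our actual pom.

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <!-- Placeholder property; the real value lives wherever the pom defines it -->
  <version>${hadoop.version}</version>
  <!-- compile is Maven's default scope and is passed on to projects that
       depend on us. Changing this to provided keeps hadoop on our own
       compile and test classpaths but stops it from propagating. -->
  <scope>compile</scope>
</dependency>
```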

If someone depends on the accumulo server, then they may have to exclude
the transitive dependency if our hadoop is polluting theirs, but I think
that issue can be mitigated by not requiring client apps to depend on the
entire server.
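
For reference, the exclusion would look roughly like this in the
downstream project's pom. The accumulo-server and hadoop-client
coordinates and both version properties are assumptions for illustration;
the excluded artifact has to match whatever we actually declare.

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-server</artifactId>
    <version>${accumulo.version}</version>
    <exclusions>
      <!-- Drop the hadoop version that Accumulo brings in transitively -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- Redeclare hadoop at the version this project wants -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${my.hadoop.version}</version>
  </dependency>
</dependencies>
```

Pinning the hadoop version in the project's <dependencyManagement> section
is the other route; it overrides the transitive version without needing
an exclusion.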

On Wed, Nov 6, 2013 at 5:17 PM, Joey Echeverria <> wrote:

> Do Accumulo users need Hadoop or its dependencies in order to use the
> client APIs?
> The only client API that I could see needing it would be the
> [In|Out]putFormats, but it'd be cool if that was a separate module and
> that module had the appropriate Hadoop dependencies with the compile
> scope.
> -Joey
> On Wed, Nov 6, 2013 at 5:05 PM, Christopher <> wrote:
> > What's the latest opinion on whether things should be marked "provided"
> > in the pom?
> > I've changed my mind on this a few times, myself, so I'm curious what
> > others think.
> >
> > The provided scope means that it will not propagate as a transitive
> > dependency. Other than that, it doesn't do much... though we can
> > control packaging based on provided or not.
> >
> > I'm not sure this gets us much, and it's inconvenient for users. We
> > can control packaging in other ways (like being more explicit and
> > carefully considering which dependencies we include in an RPM or
> > tarball, for instance).
> >
> > If we drop the provided declaration, this means that if users want to
> > build with Accumulo as a dependency, but against a different version
> > of Hadoop than what we declare in our POM, they'll have to explicitly
> > <exclude> the hadoop dependencies and redeclare them, or they will
> > have to use their <dependencyManagement> section to force a particular
> > version of hadoop.
> >
> > The advantage to users, though, if we drop this, is that they won't
> > have to constantly re-declare transitive dependencies to get their
> > projects to build/test/run.
> >
> > See
> >
> > Thoughts?
> >
> > --
> > Christopher L Tubbs II
> >
