accumulo-dev mailing list archives

From Christopher <>
Subject Re: [DISCUSS] Proposed binary packaging changes
Date Fri, 01 Jul 2016 18:25:56 GMT
On Fri, Jul 1, 2016 at 10:44 AM Sean Busbey <> wrote:

> Targeting for 2.0, including updates in the README, and having a means of
> helping the downstream user find the appropriate licensing information
> makes me much more comfortable with this.
>
> I have to ask though, why not just do source-only releases? Or source +
> publishing the binary jars needed for the public API to Maven Central?

I'd actually prefer source-only + jars in Maven... but I don't think that
could reach consensus. I figured a more limited approach, still shipping a
binary tarball but with less bundling, had a better chance of getting buy-in.

> On Thu, Jun 30, 2016 at 8:03 PM, Christopher <> wrote:
> >
> > The impetus for this was that I recently bumped our commons-math
> > dependency to commons-math3, and it was such a time sink to try to track
> > down even just that one bundled dependency's LICENSE/NOTICE modifications.
> > I seriously doubt our LICENSE/NOTICE files are fully up-to-date and in
> > sync with other bundled deps which have been updated over time.
> >
> This reasoning seems like avoiding the real problem, which seems distinct
> from not bundling 3rd party works. It's our job as a community to keep
> accurate track of our dependency licensing, even if we don't need to make
> a document about it, because we have to ensure that cat-x is kept out*.

As I see it, the problem is an artificial one. Tracking these additional
things is the result of ensuring we document and communicate our rights
and our users' rights to redistribute binary artifacts produced by other
entities. It's a problem we create by choosing to bundle. If we're not
redistributing these other artifacts because we're not bundling them, then
it's not a problem.

That said, it's still nice to try to communicate the redistribution rights
our users will have with our dependencies, so they don't have to track them
down individually. But, this isn't ultimately our responsibility. It's just
a nice thing to do for our users' convenience.

> Changes needed in our LICENSE/NOTICE for a bundled dependency change
> should be getting handled by whoever does each dependency change. Folks
> who review changes (even in our commit-then-review process) should be
> pointing

I agree... and it's not that hard either, but it's a huge time sink when a
dep version is bumped from 1.0.1 to 1.0.2 for a quick bug fix, to check to
see if it's one of the jars we're bundling, download it from Maven Central
(because that's the one we're going to bundle), unpack it, extract the
essential docs, determine what's changed, correct errors and determine what
doesn't need to be copied, and figure out how to copy/paste the required
updates into the structure of our LICENSE/NOTICE files' sections.
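
To make the unpack-and-compare part of that concrete, here is a rough sketch
in Python. It assumes both versions of the jar have already been downloaded
locally; the helper names (legal_docs, legal_diff) are hypothetical and not
part of any Accumulo tooling. It only surfaces what changed in the jar's
LICENSE/NOTICE entries. Deciding what actually has to be copied into our own
files is still a manual judgment.

```python
import difflib
import zipfile

# Hypothetical helpers for illustration only. Matching on the file name prefix
# is an assumption; some jars keep legal docs under other names.
LEGAL_NAMES = ("LICENSE", "NOTICE")


def legal_docs(jar):
    """Return {entry name: text} for LICENSE/NOTICE-like entries in a jar."""
    docs = {}
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            base = name.rsplit("/", 1)[-1].upper()
            if base.startswith(LEGAL_NAMES):
                docs[name] = zf.read(name).decode("utf-8", errors="replace")
    return docs


def legal_diff(old_jar, new_jar):
    """Return unified-diff lines for legal docs that differ between two jars."""
    old, new = legal_docs(old_jar), legal_docs(new_jar)
    lines = []
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "").splitlines()
        after = new.get(name, "").splitlines()
        lines.extend(
            difflib.unified_diff(
                before, after, "old/" + name, "new/" + name, lineterm=""
            )
        )
    return lines
```

Calling legal_diff() with the paths of the old and new jar and printing the
result line by line shows whether a bump touched the legal docs at all; an
empty list means the entries are unchanged.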

> out where due diligence hasn't been done. We spent a ton of time getting
> our LICENSE/NOTICE files correct back in September. It'd be super
> disappointing if the impact of that effort atrophied.

I agree. I don't want this to atrophy... but given my experience updating
just commons-math, I find it hard to imagine that it won't. Either that, or
we'll just avoid picking up bug fixes, security fixes, and new
features... and we'll suffer from that angle instead.

> > But, to the question of whether it's broke... I've seen several cases
> > where a version in our lib directory caused a problem with a version of
> > the same classes elsewhere in the user's system. The user thought they
> > could just avoid any dependency convergence/reconciliation on their part,
> > because they thought Accumulo would just work... and when it didn't, they
> > blamed Accumulo when it was their specific environment which was the
> > problem. If we communicate that responsibility up front, perhaps we
> > wouldn't get blamed when users fail to do their due diligence to converge
> > their dependencies or when they use wildcards excessively in their
> > classpath configs.
> If the downstream users are going to be fulfilling dependencies themselves,
> should we try to provide an accurate range of versions that we properly
> work with?

It's hard to know enough to communicate that. I think it'd be better to
establish a baseline, saying "we've tested with X" (and personally, I think
X should be relatively modern/recent), and then if users diverge to be
compatible with older/newer software, they'll know that doing so comes with
a risk that they may need to patch for that updated or previous library.
This is extremely common in downstream packaging/integration. Look at all
the distro-specific (CentOS, etc.) patches which exist solely for dependency
convergence. (For fun, look at the Fedora-specific Hadoop patches.) Trying
to test, document, and communicate a range is much harder, and it conflates
upstream development with integration tasks a bit (IMO).

Not every user gets their software from an intermediary like CentOS, RHEL,
Fedora, Ubuntu, Apache Bigtop, Cloudera, or Hortonworks distros. Some users
prefer to get their stuff direct... but these users are typically more
advanced, and should understand that doing so means they take on some
integration responsibilities for their custom environment. The intermediate
packagers/integrators are in a better position to drive widespread
adoption, though, and they are already performing these tasks regardless of
what we're bundling in our binary tarball.

> * barring maneuvers related to "optional" deployment dependencies, natch.
> --
> busbey
