river-dev mailing list archives

From Greg Trasuk <tras...@stratuscom.com>
Subject [Vote] (RIVER-432) Jar files in svn and src distributions
Date Mon, 10 Feb 2014 19:50:17 GMT

As discussion has settled somewhat, I would like to call another vote to accept the latest
patch described in 

The patch removes the archived jar files for Velocity and ASM and replaces them with Apache
Ivy scripts that download the jars from Maven Central the first time a build is done.  From
then on, the jar files are in Ivy’s repository (for more info, see http://ant.apache.org/ivy).

Voting will remain open at least until 2000 UTC Feb 13, 2014.



On Jan 3, 2014, at 12:57 PM, Greg Trasuk <trasukg@stratuscom.com> wrote:

> On Jan 3, 2014, at 5:25 AM, Simon IJskes - QCG <simon@qcg.nl> wrote:
>> In order to gain some time to discuss this first, I will vote -1.
>> First, we decided to NOT remove velocity builder.
> When I read the email chain, my impression was that we wanted to remove it (to quote
you Sim, “To be honest, I hate it”), but there was a dependency on it in the ‘extras’
folder that was added in the trunk branch.  As there is no ‘extras’ in the 2.2 branch,
and that is what this patch applies to, I thought it was fair to remove VelocityConfigurationBuilder
from the 2.2 branch.   Perhaps we should revisit the ConfigurationBuilder approach in another
thread.  For now I’ll spin another patch that doesn’t remove VelocityConfigurationBuilder.
>> Second, no need to remove the jars as specified in your own comments on RIVER-432.
>> Pulling in external jars at compile time, shall we start here?
>> They are already in the svn. They are already in the build scripts. What does this
patch fix? No legal problems?
> Apache policy is somewhat unclear on this point.  One needs to examine the mailing lists
for clues on what we should really do.  I will argue that:
> 1 - The fundamental distribution model of Apache is source code, not binaries.
> 2 - Distributing binaries is tolerated but not encouraged.  Since the svn repository
can be seen as a distribution point, binaries in svn are also tolerated but not encouraged.
> 3 - Downloading dependency binaries at build time is technologically easy, provides the
same guarantees as putting them in cvs, and avoids the question of effectively distributing
someone else’s code.
> All these together suggest that although we’re technically OK to put dependency jars
in a “-deps” package (note that the status quo _is_ unacceptable - at the very least,
we need to restructure the dependencies into a “-deps” binary package), there is some
policy uncertainty which we avoid totally by having dependencies downloaded from a known-good
source at build time.
> Let’s examine these points:
> 1 - The fundamental distribution model of Apache is source code, not binaries.  Apache
began with httpd.  Back in those days, “Open Source” software was distributed in source
form only, simply because it was mostly intended for Unix systems (then later Linux).  I recall
the first release of Perl coming down as a ten-part uunet news message.  Part of this distribution
model was practical necessity - System differences made it necessary to compile your software
on the hardware it was going to run on.  Part of it was open-source philosophy.  Having the
source code meant that you could take a look at it and verify that it wasn’t hazardous to
your operations.  
> In any case, the way we used to use open source software was (“./configure; make; make
install”).  If the software had dependencies, you built them from source, for the same reasons.
> Now, as time has gone on, we’ve gotten used to having the JVM as a common runtime,
and jar files as a common binary distribution medium.  But the Apache Foundation’s mandate
is to produce open source software that is freely usable under the Apache License.  That means
source code is at the heart of Apache, despite the rest of the world’s comfort with binaries.
 Hence Roy’s statements in (1):
>> Class files are not open source.  Jar files filled with class files
>> are not open source.  The fact that they are derived from open source
>> is applicable only to what we allow projects to be dependent upon,
>> not what we vote on as a release package.  Release votes are on verified
>> open source artifacts.  Binary packages are separate from source packages.
>> One cannot vote to approve a release containing a mix of source and
>> binary code because the binary is not open source and cannot be verified
>> to be safe for release (even if it was derived from open source).
>> I thought that was frigging obvious.  Why do I need to write documentation
>> to explain something that is fundamental to the open source definition?
> He’s talking about binary packages, not jar files in svn, but I read that (and many
other emails) as a distaste for binary distributions.
> In fact, if you look at Apache httpd’s download page, it doesn’t appear that the
Apache project publishes any Unix or Linux binaries.  They leave that to other organizations.
> 2 - Distributing binaries is tolerated but not encouraged.  Since the svn repository
can be seen as a distribution point, binaries in svn are also tolerated but not encouraged.
> It’s hard to find a single reference that encapsulates this outlook, but that’s the
impression I get from reading the various mailing lists.  For instance, Sam Ruby says (2):
>> IMO, our projects release source. So, our projects should not maintain object/binary
>> in their svn release tree, regardless of license (category a or b).
> There is some debate on whether the svn tree should be considered a distribution point.
 Incubator releases are regularly called out for not having “NOTICE” and “RELEASE”
files at all reasonable checkout points in svn.  [LEGAL-26] (https://issues.apache.org/jira/browse/LEGAL-26)
concerns this and remains unresolved.
> Doug Cutting (3) says:
>> On Mon, Sep 16, 2013 at 2:50 AM, Stephen Connolly
>> <stephen.alan.connolly@gmail.com> wrote:
>>> * Source control is not an Apache distribution and hence we do not need to
>>> have LICENSE and NOTICE files in source control, it can be a nice
>>> convenience, but there is no *requirement*.
>> I think perhaps you're looking for clear lines where things are
>> actually a bit fuzzy.  Certainly releases are official distributions
>> and need LICENSE and NOTICE files.  That line is clear.  On the other
>> hand, we try to discourage folks from thinking that source control is
>> a distribution.  Rather we wish it to be considered our shared
>> workspace, containing works in progress, not yet always ready for
>> distribution to folks outside the foundation.  But, since we work in
>> public, folks from outside the foundation can see our shared workspace
>> and might occasionally mistake it for an official distribution.  We'd
>> like them to still see a LICENSE and NOTICE file.  So it's not a
>> hard-and-fast requirement that every tree that can possibly be checked
>> out have a LICENSE and NOTICE file at its root, but it's a good
>> practice for those trees that are likely to be checked out have them,
>> so that folks who might consume them are well informed.
> Again, he’s not talking directly about jar files in svn, however I think his statement
that “since we work in public, folks from outside the foundation can see our shared workspace
and might occasionally mistake it for an official distribution” applies here.  Fundamentally,
the decision on how and where to distribute ‘velocity.jar’ rightly belongs with the Velocity
group and I don’t think we ought to redistribute it.
> 3 - Downloading dependency binaries at build time is technologically easy, provides the
same guarantees as putting them in cvs, and avoids the question of effectively distributing
someone else’s code.
> There doesn’t seem to be clear policy in the ASF on this, as evidenced by the frequent
debates on it, and the lack of documentation.  I’ve tried to lay out an argument that having
jars in svn is not encouraged by the ASF (really, it’s not in line with the ASF’s charter),
even if it isn’t disallowed.  You may disagree, and I won’t claim I’ve made a strong
argument, simply because the policy isn’t clear.  So instead of going through arguments
that amount to differences of opinion on Apache policy, let’s use a technological solution
that is simple, common, and avoids the question altogether, by automatically downloading the
dependencies at build time.
> Projects that use Maven do this automatic download as standard practice (that’s what
Maven does, and that’s what the Maven Central infrastructure is there to support).  We don’t
use Maven, which is fine (our customers have asked us to make our binaries available in Maven
Central, and we’ve done that).  Apache Ivy is a popular add-on to Apache Ant that provides
similar dependency resolution to an Ant-based build.
> I was a little surprised how easy it was to persuade Ivy to get the required dependencies
at build time.  The “ivy.xml” file is 39 lines including the ASL header (which by the
way I forgot to include in the patch - I’ll fix that).  There are about 50 lines added to
‘build.xml’ to download Ivy and then download the required jar files.
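
For readers unfamiliar with this kind of bootstrap, here is a minimal sketch of how an Ant build can fetch Ivy and then use it to retrieve dependencies. It is not the content of the RIVER-432 patch: the project name, target names, property names, Ivy version (2.3.0), the `lib` output directory and the download location are all assumptions chosen for illustration.

```xml
<!-- Illustrative sketch only; names, versions and paths are assumptions,
     not taken from the RIVER-432 patch. -->
<project name="ivy-bootstrap-sketch" default="resolve"
         xmlns:ivy="antlib:org.apache.ivy.ant">

    <property name="ivy.version" value="2.3.0"/>
    <property name="ivy.jar.dir" value="${user.home}/.ant/lib"/>
    <property name="ivy.jar.file" value="${ivy.jar.dir}/ivy-${ivy.version}.jar"/>

    <!-- Download the Ivy jar itself from Maven Central if it is missing or stale. -->
    <target name="download-ivy">
        <mkdir dir="${ivy.jar.dir}"/>
        <get src="https://repo1.maven.org/maven2/org/apache/ivy/ivy/${ivy.version}/ivy-${ivy.version}.jar"
             dest="${ivy.jar.file}" usetimestamp="true"/>
    </target>

    <!-- Make the Ivy Ant tasks available under the ivy: namespace. -->
    <target name="init-ivy" depends="download-ivy">
        <path id="ivy.lib.path">
            <fileset dir="${ivy.jar.dir}" includes="*.jar"/>
        </path>
        <taskdef resource="org/apache/ivy/ant/antlib.xml"
                 uri="antlib:org.apache.ivy.ant" classpathref="ivy.lib.path"/>
    </target>

    <!-- Resolve the dependencies declared in ivy.xml and copy them into lib/. -->
    <target name="resolve" depends="init-ivy">
        <ivy:retrieve pattern="lib/[artifact]-[revision].[ext]"/>
    </target>
</project>
```

With something like this in place, running the `resolve` target reads the `ivy.xml` next to the build file and copies the resolved jars into `lib`; later builds reuse Ivy's local cache rather than downloading again.
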
> So, given that the status-quo seems to be unacceptable (Roy talks about not having jar
files in the open-source trees, only in “-deps” and “tools” trees), we have two options:
> (a) - restructure the svn repository and the build to allow a separate “-deps” distribution.
 This wouldn’t affect our binary distributions (note that I’m no longer using the term
“binary release”), but to build from source, a user would have to download a separate
archive, unpack it, and then copy those files into the directory that was unpacked from the
source release.  This option effectively still has us distributing dependent binaries, which
is not the goal of the ASF, just with a disclaimer that says “this isn’t an ASF release,
it’s just a binary distribution put together by a committer for your convenience, so don’t
feel you should trust it too much”.
> (b) - use Ivy to get the jars from Maven Central automatically as part of the build.
> I think (b) is the option that causes the least hassle for our downstream consumers,
and not much hassle for us.
>> Pulling external jars at compile time also makes it more difficult to certify the
software. In order to certify the software you need to establish a baseline that will be guaranteed
the same, even if you pull it from the archive 10 years later.
> As I said above, Apache’s focus is creating software that can be built from source,
not distributing binaries (note that QCG or another company might have a different focus,
and is perfectly able to distribute binaries under the Apache license).  I think a reasonably
prudent user would ask “How can I trust the ‘velocity.jar’ that’s included in this
binary?”  And the answer would be either “because I built it from source and installed
it in my corporate repository” (very cautious, but not unheard-of) or “It was published
by the Velocity group to a trusted repository, Maven Central” (more common).
> If you look in the “ivy.xml” file you’ll see that the dependencies are specified
using Maven-style “group-artifact-version” coordinates.  Old versions are maintained in
Maven Central forever.  I suppose it’s possible that a publisher could convince Maven Central
to remove a version for some reason (security or licensing problems perhaps), but then, would
we want to be distributing that version in a “-deps” package?
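
As an illustration of the coordinate style described above, a dependency declaration in an `ivy.xml` file might look roughly like the following. This is a sketch, not the file from the patch: the `info` values and the specific revisions shown for ASM and Velocity are assumptions, and the real file may list different artifacts or versions.

```xml
<!-- Illustrative sketch only; organisation, module and revisions are assumptions. -->
<ivy-module version="2.0">
    <info organisation="org.apache.river" module="river"/>
    <dependencies>
        <!-- org / name / rev correspond to Maven groupId / artifactId / version. -->
        <dependency org="asm" name="asm" rev="3.2"/>
        <dependency org="org.apache.velocity" name="velocity" rev="1.7"/>
    </dependencies>
</ivy-module>
```

Because each dependency pins an explicit `rev`, the build asks for the same published version every time, for as long as that version remains available in the repository.
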
> I agree that it’s not enough to just say “you need to download such-and-such jar”,
hence the automatic download managed by “Ivy” from Maven Central.
>> It is not a high-level project that builds on several frameworks. It is a low-level
system library. The stuff below the stack is minimal. The number of jars we use is limited.
Why bother?
> In the currently released branches, the dependencies are limited to ASM and Velocity.
 Looking forward to the trunk branch and the qa_refactor branch, the number of external dependencies
seems to be increasing (IMO I don’t like that, because I also view River as a low level system
library, but I’m only one PMC member).  We need to get in front of the problem before we
start distributing large numbers of dependencies.
> This point rolls in with the general question of jar files in version control.  I was
always taught that version control was for source code, and putting binaries into version
control was a bad idea.  In addition, there are practical problems - with older systems like
cvs, even doing an update or commit effectively downloads the binaries, which slows things
down if there are large binary files.  On newer distributed version control systems like git
or Mercurial, the entire repository, including all versions of binary artifacts, comes down
with the project checkout.  Currently, we have one version of relatively few jar files in
our repository, so it’s not a major issue.  But it gets worse as time goes on.  So I suggest
we work out the technology now to avoid the problem.
>> Gr. Simon
> Thanks for the questions, Sim.  I hope you’ll come around to removing your ‘-1’.
> Cheers,
> Greg
> Footnotes
> ——————
> (1) - Roy Fielding - http://s.apache.org/roy-binary-deps-1
> (2) - Sam Ruby - http://s.apache.org/r5
> (3) - Doug Cutting - http://s.apache.org/GNP
>> On 02-01-14 18:22, Greg Trasuk wrote:
>>> Hello all:
>>> Please have a look at the patch mentioned below and cast a vote on it.
>>> The main idea is to remove the dependency jar files from the source distribution.
 As a side effect of using Ivy, it becomes reasonable to remove them from the svn archive
as well.  Also, the Velocity dependency was there to support the VelocityConfigurationBuilder.
 We had discussed removing that component, so rather than move that dependency to Ivy, I’ve
removed VelocityConfigurationBuilder.
>>> It’s arguable whether the VelocityConfigurationBuilder was part of the official
Jini API (I see it as a utility, not API), so I don’t think this commit actually requires
a vote.  However, it does seem like a significant change to the build process that ought to
be reviewed.  So I propose to treat this as a “lazy consensus” vote, and will commit the
change to the 2.2 branch if there are no objections in 72 hours (i.e. 1730UTC 20140105).
>>> At the same time, based on discussions over on general@incubator.apache.org,
I’ll withdraw my assertion that we can’t have jars in svn.  Those interested may want
to check out the thread at http://mail-archives.apache.org/mod_mbox/incubator-general/201312.mbox/%3C01B04CC4-95B8-4A39-BC16-04BAA4269B65%40stratuscom.com%3E
>>> Cheers,
>>> Greg.
>>> On Jan 2, 2014, at 12:05 PM, Greg Trasuk (JIRA) <jira@apache.org> wrote:
>>>>    [ https://issues.apache.org/jira/browse/RIVER-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>>>> Greg Trasuk updated RIVER-432:
>>>> ------------------------------
>>>>   Attachment: river-2_2_remove_jars.diff
>>>> The attached patch for the 2.2 branch does the following:
>>>> - removes the 'asm' directory and 'tests/lib' directories which currently
contain the asm library, mockito, and junit jars.
>>>> - Modifies 'build.xml', 'common.xml', and adds 'ivy.xml' so that the Apache
Ivy ant plugin is downloaded at build time, and then used to retrieve the libraries mentioned
above from Maven Central.  This removes the need to have the jar files in svn.
>>>> - Removes (as per discussion http://mail-archives.apache.org/mod_mbox/river-dev/201211.mbox/%3C509B99E3.6080800%40qcg.nl%3E)
the VelocityConfigBuilder, and associated Velocity jars.  Note that the 'extras' folder is
not present in the 2.2 branch, so Sim's last comments in the thread do not apply.
>>>>> Jar files in svn and src distributions
>>>>> --------------------------------------
>>>>>               Key: RIVER-432
>>>>>               URL: https://issues.apache.org/jira/browse/RIVER-432
>>>>>           Project: River
>>>>>        Issue Type: Bug
>>>>>          Reporter: Greg Trasuk
>>>>>       Attachments: river-2_2_remove_jars.diff
>>>>> Recent traffic on the incubator lists has pointed out that including
jar files for dependencies in the subversion repository and the source distributions is against
Apache policy.
>>>>> In River, the following libraries appear in the Subversion repository
and the source distributions (these are from trunk, a smaller set appear in the 2.2 branch):
>>>>> animal-sniffer
>>>>> asm
>>>>> bouncy-castle
>>>>> dnsjava
>>>>> high-scale-lib
>>>>> rc-libs
>>>>> velocity
>>>>> They all have to go.  What are we using them for?  As I understand it,
we were going to remove the VelocityConfigurationBuilder, so that's not a problem.  Some of
the others are available from Maven Central, so we can get them at build time using Ivy or
another build tool.  Which ones are actually required?  And where did they come from?
>>>> --
>>>> This message was sent by Atlassian JIRA
>>>> (v6.1.5#6160)
>> -- 
>> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
>> Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397
