avro-dev mailing list archives

From "Matt Massie (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-163) Each language Avro supports should be a separate package
Date Wed, 21 Oct 2009 01:11:59 GMT

    [ https://issues.apache.org/jira/browse/AVRO-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768044#action_12768044 ]

Matt Massie commented on AVRO-163:

bq. The top-level build should probably use a build system, as it will use dependencies. For
example, running interop testing requires first building all ports, then launching their daemons,
then running the tests, and finally shutting everything down. Similarly, coordinating a pan-language
documentation build may not be trivial. 

You're probably right here.  We might want to use 'ant' or 'make' as the top-level build system
so that we get dependency management for the interoperability tests, since they require tight
coordination between languages (e.g. starting/stopping daemons).  I'd be surprised if there
were any pan-language dependencies for building documentation.
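
To make the ordering concrete, here is a minimal sketch of the interop sequence described above: build every port, launch the daemons, run the tests, and shut everything down even if the tests fail. It is written in Python only for illustration; the function name run_interop and the command lists passed to it are hypothetical and not part of any existing Avro build.

```python
import subprocess


def run_interop(build_cmds, daemon_cmds, test_cmd,
                run=subprocess.run, popen=subprocess.Popen):
    """Build all ports, start daemons, run tests, then stop the daemons.

    All commands are argument lists supplied by the caller; nothing here
    is an actual Avro build target.
    """
    # Step 1: every port must build before anything is launched.
    for cmd in build_cmds:
        run(cmd, check=True)

    daemons = []
    try:
        # Step 2: launch one daemon per language implementation.
        for cmd in daemon_cmds:
            daemons.append(popen(cmd))
        # Step 3: run the interop tests against the running daemons.
        return run(test_cmd).returncode
    finally:
        # Step 4: always shut down, in reverse start order.
        for proc in reversed(daemons):
            proc.terminate()
            proc.wait()
```

The run and popen parameters exist only so the sequence can be exercised without starting real processes; 'ant' or 'make' would express the same dependencies declaratively.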

bq. Also, a single release artifact simplifies the release process. We might subsequently
break it into multiple artifacts, but we expect to make coordinated releases and hence for
folks to vote on all of the implementations at once, and a single artifact makes it clear
that we're all in agreement.

I agree with you that the team should be in complete agreement during a release.  However,
I'm not sure that a single release artifact demonstrates team agreement any more or less than
having the word 'avro' and the umbrella release version in multiple artifact names at the
same base URL.  To be clear, I'm not advocating that each language have a separate release
cycle.  That fragmentation would be bad for the team and confusing for users.  I'm just saying
that when we run 'ant release' or 'make release', multiple artifacts should be generated instead
of one.  We should all vote at the same time for the release artifacts that comprise each

bq.  It's useful to easily grab independent parts of the release artifact, either as subdirectories
or as nested archives. We currently include the jars in the release and also then push them
to Maven. Other distributions should also be trivial to extract from the release artifact.

Would this monolithic release artifact be a lightweight source-only artifact like the ones Thrift
and protobuf provide (~1MB)?  I might be able to use a source-only release as the pristine basis
for packaging Avro for Linux distributions.  Binary packages shouldn't be used for pristine
source, e.g. from the Fedora packaging guidelines:

When you encounter prebuilt binaries in a package you MUST:

    * Remove all pre-built program binaries and program libraries in %prep prior to the building
      of the package. Examples include, but are not limited to, *.class, *.dll, *.DS_Store, *.exe,
      *.jar, *.o, *.pyc, *.pyo, *.so files.
    * Ask upstream to remove the binaries in their next release.

If we continue to release a monolithic binary release artifact with loads of third-party jar
files, I'll be forced to continually track the binary files and remove them as part of my
C/C++ packaging effort.
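
To give a sense of the tracking work involved, here is a rough sketch that lists the files in an unpacked release tree matching the example patterns from the Fedora guideline quoted above. The function name find_prebuilt is hypothetical, and the pattern list covers only the guideline's examples, which are explicitly not exhaustive.

```python
import fnmatch
import os

# Example patterns copied from the Fedora guideline quoted above.
PREBUILT_PATTERNS = [
    "*.class", "*.dll", "*.DS_Store", "*.exe",
    "*.jar", "*.o", "*.pyc", "*.pyo", "*.so",
]


def find_prebuilt(root):
    """Return sorted paths (relative to root) of files matching the patterns."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Case-sensitive match on the file name only.
            if any(fnmatch.fnmatchcase(name, pat) for pat in PREBUILT_PATTERNS):
                full = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full, root))
    return sorted(hits)
```

A packager would have to rerun something like this against every new release tarball and delete the results in %prep, which is exactly the recurring effort a source-only artifact avoids.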

Alternatively, the nested archives idea might work but I've never seen it done in practice.
 Part of the problem is that you have to submit a full URL to the pristine source package
for the program you are packaging and I've never heard of a package maintainer pointing to
a tarball within a tarball for pristine source.  If we're already generating the tarball as
part of the release process, why not just drop it on the web server and give it a unique URL
for Debian/RPM packaging in the process?
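
As an illustration of one URL per artifact, the sketch below maps each language to a tarball name and joins it to a release base URL. The tarball name patterns follow the examples in the issue description; the function name release_urls and the base URL in the usage note are assumptions made up for this example.

```python
# Name patterns follow the examples in the issue description
# (avro-java-1.2.3.tar.gz, pyavro-1.2.3.tar.gz, ...).
ARTIFACT_PATTERNS = {
    "java": "avro-java-{version}.tar.gz",
    "python": "pyavro-{version}.tar.gz",
    "c++": "avro-c++-{version}.tar.gz",
    "c": "avro-c-{version}.tar.gz",
}


def release_urls(base_url, version):
    """Return a dict mapping language to the URL of its release tarball."""
    base = base_url.rstrip("/")
    return {
        lang: base + "/" + pattern.format(version=version)
        for lang, pattern in ARTIFACT_PATTERNS.items()
    }
```

For example, release_urls("https://example.invalid/avro/1.2.3", "1.2.3")["c"] gives a single URL ending in avro-c-1.2.3.tar.gz, which is the kind of pristine source link a Debian or RPM package needs. All artifacts still share one umbrella version, so the release stays coordinated.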

* Debian New Maintainer's Guide http://www.debian.org/doc/manuals/maint-guide/index.en.html
* Fedora Packaging Guidelines https://fedoraproject.org/wiki/Packaging/Guidelines

> Each language Avro supports should be a separate package
> --------------------------------------------------------
>                 Key: AVRO-163
>                 URL: https://issues.apache.org/jira/browse/AVRO-163
>             Project: Avro
>          Issue Type: Improvement
>          Components: c, c++, java, python
>    Affects Versions: 1.0.0, 1.1.0, 1.2.0
>         Environment: We currently release Avro as a single monolithic tarball with ant
being used to build all the languages that Avro supports.
>            Reporter: Matt Massie
>            Assignee: Matt Massie
>            Priority: Critical
>             Fix For: 1.2.1, 1.3.0
>   Original Estimate: 8h
>  Remaining Estimate: 8h
> *Build Issue*
> While ant is used for building Java projects, it is almost never used to build python,
c++ or c projects.  C and C++ projects are often managed using autotools while Python uses
setuptools.  Forcing these languages to use a foreign build system ('ant') is suboptimal and
will cause us headaches as we move forward.
> *Release issue*
> Releasing a single monolithic package forces users of one language to download binary
and source for all languages.  For example, at this time the Avro C distribution is only 384K
in size (built using autotools 'make distcheck' target).  People interested in using the C
implementation would be forced to download a large monolithic tarball (currently 3.8 MB) that
includes dozens of third-party jar files for the Java implementation.  Furthermore, C users
would be forced to use 'ant' as the top-level build tool.  This monolithic approach would
also prevent us from submitting Avro for inclusion in Linux distribution yum/apt repositories
as RPM and Debian packages.  It's important to allow C/C++ code to have a pristine release
tarball on which to base Debian and RPM packaging.
> *Solution*
> Create top-level directories: 'java', 'python', 'c++', 'c', 'shared' and 'release'.
 Each language directory would contain the source for that language and use the build system
natural for that language, e.g. ant, autotools, setuptools, gem, etc.  The 'shared' directory
would have, for example, common test schema and data files for interoperability testing between
each language.  A simple top-level bash script would call into each language to build a release
package, documentation, etc. into the 'release' directory.  Each Avro release would then be
composed of package(s) for each language Avro supports, e.g. avro-java-1.2.3.tar.gz, pyavro-1.2.3.tar.gz,
avro-c++-1.2.3.tar.gz and avro-c-1.2.3.tar.gz.  Later on, we'll also likely have libavro-devel-1.2.3-1.x86_64.rpm
