lucy-dev mailing list archives

From Marvin Humphrey
Subject [lucy-dev] Build from top dir
Date Tue, 07 Dec 2010 00:43:51 GMT

The Lucy repository is currently set up so that each binding language gets a
dedicated subdirectory, an arrangement which averts collisions between files
and folders required by different hosts.  For instance, both Perl and Ruby
want to lay claim to a 'lib' directory; since they can't both have 'lib' at
the top level, we give them each one within their subdirectory:

    $dist_root/perl/lib/
    $dist_root/ruby/lib/

Right now, to build Lucy using the Perl bindings, you must first cd to
$dist_root/perl/, then run the standard incantations:

    cd perl/
    perl Build.PL
    ./Build test

There are a number of files and folders which you would ordinarily expect to
find at the root level of a CPAN distribution, but in the Lucy repository,
they are all one level down, underneath the perl/ directory:

    $dist_root/perl/lib/        # contains .pm files
    $dist_root/perl/t/          # contains test files
    $dist_root/perl/META.yml    # Needed by CPAN.

This repository layout is not compatible with CPAN, which requires either a
Build.PL or a Makefile.PL at the top level, and also wants a META.yml file at
the top level from which metadata can be extracted without running code.  If
we simply perform an "svn export" of the current Lucy repository and package
it up as a tarball, all the automated tools that deal with CPAN archives will
be at a loss: those tools can't read a README or INSTALL document which tells
them to cd to perl/ first.
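
The two top-level requirements described above can be checked mechanically.
What follows is only a minimal sketch in Python, not anything the CPAN
toolchain actually runs; the function name cpan_layout_problems is made up
for illustration, and it tests nothing beyond the presence of the files.

```python
import os


def cpan_layout_problems(root):
    """Return reasons why `root` does not look like a CPAN distribution.

    Hypothetical helper: it only checks the two requirements named in the
    text (a builder script and META.yml at the top level).
    """
    problems = []
    has_builder = any(
        os.path.isfile(os.path.join(root, name))
        for name in ("Build.PL", "Makefile.PL")
    )
    if not has_builder:
        problems.append("no Build.PL or Makefile.PL at the top level")
    if not os.path.isfile(os.path.join(root, "META.yml")):
        problems.append("no META.yml at the top level")
    return problems
```

Run against an export of the current repository, where those files live
under perl/, a check like this would be expected to report both problems.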

To address this problem, the "dist" build target for the Perl bindings moves a
bunch of files around, eventually wrangling a tarball with a CPAN-compatible
layout.  This has worked fine for KinoSearch, which has had only a Perl
target.  Lucy, however, will have multiple targets.
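
To make "moves a bunch of files around" concrete, here is a rough Python
sketch of one way such a step could work: copy the repository into a staging
directory, then hoist everything under perl/ up to the top.  This is an
assumption for illustration only, not the actual "dist" target; the function
name and the collision policy are hypothetical.

```python
import os
import shutil


def stage_cpan_layout(repo_root, staging_root, host_dir="perl"):
    """Copy repo_root to staging_root, then hoist host_dir's entries up.

    Hypothetical sketch: staging_root must not exist yet, and any name
    collision at the top level is treated as an error.
    """
    shutil.copytree(repo_root, staging_root)
    host_path = os.path.join(staging_root, host_dir)
    for entry in sorted(os.listdir(host_path)):
        target = os.path.join(staging_root, entry)
        if os.path.exists(target):
            raise FileExistsError(entry + " already exists at the top level")
        shutil.move(os.path.join(host_path, entry), target)
    # The host directory is empty now, so it can be removed.
    os.rmdir(host_path)
    return staging_root
```

The collision check matters once more than one host is involved: hoisting
both perl/ and ruby/ this way would fail on 'lib', which is exactly the
conflict the subdirectory layout was meant to avoid.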

The logical extension of our current system is to build one release archive
for each host language which has a dedicated distribution network (while other
hosts which do not have such a network, such as Objective C, would get a
general-purpose tarball):

    Perl   => CPAN
    Python => PyPI
    Ruby   => Rubygems
    PHP    => PEAR
    Java   => Maven

However, the PPMC cannot be burdened with QCing and voting on all those
archives.  There will have to be a single release tarball which is basically a
snapshot of the repository (probably augmented with some autogenerated files);
that's what the PPMC, and ultimately the Incubator PMC and the ASF, will
vote on.

Some projects at Apache make binary release artifacts available in addition
to the source releases.  However, only the source release is official; binary
releases are considered volunteer efforts.  Dedicated Lucy release archives
for CPAN, PyPI, etc. would fall under the same classification.

As an experiment, I've been playing around with moving Build.PL into the top
level of the Lucy repository.  The rationale is that if we make the Lucy
repository layout CPAN-compatible, we can reuse the blessed tarball as our
CPAN tarball as well.  This spares us from having to derive and verify a
dedicated CPAN distribution tarball, avoiding extra work and lessening the
opportunity for error.

It took some tweaking, but eventually I was able to coerce the Perl build to
work properly with Build.PL up at the top.  I've uploaded a patch to JIRA as
LUCY-130.

If we take this same approach for other host languages with dedicated
distribution networks, in time we end up with a messy top-level directory:

   $dist_root/Build.PL      # Perl/CPAN
   $dist_root/Rakefile      # Ruby/Rubygems
   $dist_root/      # Python/PyPI
   $dist_root/pom.xml       # Java/Maven
   $dist_root/Changes       # Shared.
   $dist_root/LICENSE       # Shared.
   $dist_root/NOTICE        # Shared.
   $dist_root/MANIFEST      # Shared.
   $dist_root/README        # Shared.
   $dist_root/META.yml      # Needed by CPAN.
   $dist_root/core/         # Core source code.
   $dist_root/modules/      # Core source code.
   $dist_root/c/            # C bindings.
   $dist_root/java/         # Java bindings.
   $dist_root/perl/         # Perl bindings.
   $dist_root/objective_c/  # Objective C bindings.
   $dist_root/php/          # PHP bindings.
   $dist_root/python/       # Python bindings.
   $dist_root/ruby/         # Ruby bindings.

Also, while I believe that CPAN, PyPI, and the gem format can all be made to
work with an alternate layout, a cursory investigation suggests that PEAR
archives are uniform, sparse, and look nothing like the Lucy repository.  I
suspect we'll have to roll dedicated PEAR releases regardless.

Due to the eventual messiness, I have mixed feelings about the patch in
LUCY-130.  However, I've come to believe that it's a decent
short-to-medium-term solution.  

  * It makes our release process more reliable at a time when we are still
    getting used to Apache institutions and are most likely to make mistakes.
  * It fixes bugs in the current "dist" target, which would have
    omitted the mandatory files LICENSE and NOTICE (since they reside at the
    top level).

And, if down the road we decide that the top level dir has gotten unwieldy, we
have the freedom to change the layout and opt for dedicated release tarballs
after all -- ultimately, repository layout is an implementation detail rather
than a public API.
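
Whichever layout we settle on, the LICENSE/NOTICE omission mentioned above is
the sort of thing a small pre-release check could catch.  Here is a minimal
Python sketch; the function name is hypothetical, and it assumes the tarball
unpacks into a single leading directory, as release tarballs usually do.

```python
import posixpath
import tarfile


def missing_top_level_files(tarball_path, required=("LICENSE", "NOTICE")):
    """Return the names in `required` absent from the archive's top level.

    Assumes every member lives under one leading directory
    (e.g. "dist-0.1/LICENSE"); this is an illustrative check only.
    """
    top_level = set()
    with tarfile.open(tarball_path) as archive:
        for member in archive.getmembers():
            parts = posixpath.normpath(member.name).split("/")
            # parts[0] is the leading directory, parts[1] a top-level entry.
            if len(parts) == 2 and member.isfile():
                top_level.add(parts[1])
    return [name for name in required if name not in top_level]
```

A release script could refuse to proceed whenever this returns a non-empty
list.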

Marvin Humphrey
