lucy-dev mailing list archives

From Marvin Humphrey <>
Subject Re: [lucy-dev] Generalize Tutorial for multiple host languages
Date Mon, 08 Nov 2010 19:00:38 GMT
On Sun, Nov 07, 2010 at 01:07:38PM -0600, Peter Karman wrote:
> HTML::Entities is licensed under the same terms as Perl, which means it may be
> redistributed under the Artistic License or the GPL. But it's up to the
> redistributor to decide, yes?

Hmm, yes.  I suppose that means that we could assert that our usage of those
modules within the tutorial/sample code was under the terms of the Artistic
License and *not* under the GPL.  The question then becomes whether such usage
is compatible with distribution of the tutorial/sample code under the Apache
License 2.0.

For what it's worth, I think switching in CGI::escapeHTML() for some usages of
HTML::Entities::encode_entities() is OK.  I changed the charset for search.cgi
from Latin-1 to UTF-8, so it's no longer necessary to encode code points above
255 as HTML entities.  The only important thing we need entity encoding for
now is to guard against cross-site-scripting attacks, and for that,
CGI::escapeHTML() suffices.
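
As a rough sketch of the XSS guard described above, user-supplied text can be
passed through CGI::escapeHTML() before it is interpolated into the page.  The
render_hit() helper and its markup are hypothetical, not taken from search.cgi,
and this assumes CGI.pm is installed:

```perl
use strict;
use warnings;
use CGI ();

# Hypothetical helper: render one search hit title as an HTML list item.
# Any markup characters in $title are entity-encoded so they are displayed
# as text rather than interpreted by the browser.
sub render_hit {
    my ($title) = @_;
    my $safe = CGI::escapeHTML($title);
    return "<li>$safe</li>";
}

print render_hit('<script>alert(1)</script>'), "\n";
```

With the page served as UTF-8, nothing beyond this escaping step should be
needed for the text itself.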

Switching in CGI::escape() instead of encode_entities() for URL encoding is
actually a bugfix.  (If we were going to use a CPAN module for that, it should
have been URI::Escape, which offers the function uri_escape_utf8().)
CGI::escape() would be a gimme except that it was silently dedocumented back in
2005 with version 3.06, when cgi_docs.html was abandoned in favor of the
module POD -- escape() and unescape() didn't make the jump from the old
documentation to the new.  CGI is on version 3.49 now and it's distributed with
the Perl core, so escape() isn't going anywhere -- it's safe to use, just no
longer publicly documented.
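
For illustration, a minimal sketch of URL-encoding query string values with
CGI::escape().  The next_page_url() helper and the parameter names "q" and
"offset" are hypothetical (only search.cgi itself is named above), and this
assumes CGI.pm is installed:

```perl
use strict;
use warnings;
use CGI ();

# Hypothetical helper: build a link to another page of results.
# Each value is percent-encoded so that characters such as "&", "=" and
# spaces cannot be mistaken for query string delimiters.
sub next_page_url {
    my ($query, $offset) = @_;
    return 'search.cgi?q=' . CGI::escape($query)
         . '&offset=' . CGI::escape($offset);
}

print next_page_url('rock & roll', 10), "\n";
```

Entity encoding would turn the "&" in that query into "&amp;", which still
contains a literal "&" and so still splits the query string at that point;
percent-encoding does not have that problem.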

> Same is true of HTML::TreeBuilder and Data::Pageset.

Data::Pageset is a mild improvement at best.  HTML::TreeBuilder is no longer
necessary if we go with a plain text corpus.

In conclusion... for the primary Tutorial documentation, we can and arguably
*should* eliminate all non-core-Perl dependencies -- if for no other reason
than making it easier to run the sample code and go through the Tutorial.

Elsewhere, though, there are two Perl-licensed modules that we *do* care
about:

  * Parse::RecDescent, for the Clownfish compiler and for sample/cookbook
    code.
  * JSON::XS, for Lucy itself, until we write our own JSON parser.

> So I don't see why it's necessary to reinvent those dependencies. The Artistic
> license is *not* listed under the "Category X" page at

The Artistic License isn't listed at all on that page, which means it hasn't
yet been ruled on.  Usage has been discussed in
<>, but that deals with sample
data, not code dependencies.  It's also come up on legal-discuss@a.o, but it's
never resulted in an official outcome.

We should take this up with legal and ask for clarification.  It would be nice
if we didn't have to deal with replacing JSON::XS right away, but could put
that task off until after the first release.

Here's a draft of the question we might ask legal via JIRA:

    The Apache Lucy Incubator podling is working to pare down its list of
    dependencies, but there are two CPAN distributions which we would like to
    put off replacing for the time being (Parse::RecDescent and JSON::XS).
    These two distributions are both licensed, as is common for CPAN modules,
    under the "same terms as Perl itself".  Perl's licensing is here:

    We do not wish to bundle these CPAN distributions with Lucy, but instead
    specify them as prerequisites.  We assert that our usage of the modules in
    question falls under the terms of the Artistic License and *not* the GPL.

    Lucy interfaces with these modules in three places:

        * At build time (Parse::RecDescent).
        * Within Lucy itself at runtime (JSON::XS).
        * Within sample/cookbook code (Parse::RecDescent).

    We have two questions:

    Is it acceptable for code released under the Apache License 2.0 to have a
    non-optional dependency on code which is licensed under the Artistic
    License?

    Is it acceptable to classify these modules as "system dependencies", which
    the user is expected to install?

Marvin Humphrey
