incubator-lucy-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject [lucy-dev] LucyX
Date Fri, 11 Mar 2011 00:23:24 GMT
Greets,

We currently provide a number of modules under the "LucyX" namespace:

    LucyX::Simple
    LucyX::Index::ByteBufDocReader
    LucyX::Index::ByteBufDocWriter
    LucyX::Index::LongFieldSim
    LucyX::Index::ZlibDocReader
    LucyX::Index::ZlibDocWriter
    LucyX::Remote::SearchClient
    LucyX::Remote::SearchServer
    LucyX::Search::Filter
    LucyX::Search::MockScorer
    LucyX::Search::ProximityQuery

These modules present various levels of maturity, usability and documentation.
All save ProximityQuery are implemented in pure Perl.  Some used to live under
the KinoSearch namespace, but were moved underneath KSx; others began life
under KSx.

I would ultimately like to see the Lucy core *shrink*, for example by moving
SnowballStemmer and SnowballStopFilter out.  If they were to follow the
example of e.g. SearchClient and SearchServer, that would imply moving them
under LucyX -- but LucyX doesn't seem like quite the right mechanism.

Right now, the LucyX namespace is sort of sandboxy.  In contrast,
SnowballStemmer and SnowballStopFilter are not sandboxy -- they are stable and
widely used.  I don't think they should live in the same shared object as the
Lucy core, but they should continue to receive the same level of attention and
commitment they receive now.

Perhaps we can look to the Apache HTTPD webserver as a model.  HTTPD has a
"modules" directory, where items such as mod_ssl live:

  https://svn.apache.org/repos/asf/httpd/httpd/trunk/modules

IMO, our Snowball materials should live in an analogous directory.  Like
mod_ssl, they would not be a prerequisite for compiling the core library, yet
would remain an intrinsic part of the distribution.

With regard to the existing classes under LucyX, I think we might consider
migrating a number of them out of our repository and into the "Apache Extras"
hosting service set up by Google.

  http://blogs.apache.org/foundation/entry/the_apache_software_foundation_launches
  http://googlecode.blogspot.com/2010/12/announcing-apache-extrasorg.html

ProximityQuery can't move at this time because its Matcher is written in C and
we haven't published a Lucy C API, and I think we might move LucyX::Simple
to Lucy::Simple... but everything else, we could offload.  Moving those items
to apache-extras would set a powerful example, and would establish a sandbox
that does not need to be constantly monitored and curated by the Lucy PMC.

I don't think it's urgent that we make these moves before 0.1.0-incubating.
Nevertheless, I would like to have at least thought about how best to
accommodate the contributions we hope to receive in the future.  Though it is
dwarfed by Lucene's gigantic code base, Lucy is large already; we should
ensure that getting something committed to trunk/core/ is not the only way to
scratch an itch.

Marvin Humphrey

