lucy-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Incubator or out?
Date Sat, 26 Jun 2010 17:57:40 GMT
Greets,

From a technical standpoint, the Lucy project has attained two major
milestones within the last year:

  * The Clownfish object system has been finished.
  * We have devised a mechanism marrying together Lucene's write-once, 
    segment-based file format and memory-mapped IO which yields numerous 
    benefits including near-real-time search, minimized process RAM 
    requirements, OO conveniences such as "cheap searchers", and more.

Furthermore, we have a beautifully working Lucy prototype: the dev branch of
KinoSearch, which has been retrofitted to use the Lucy core.  The next step for
Lucy is to flesh out indexing and search classes.  However, license
constraints, Apache IP policies, and the fact that Lucy was proposed as a
from-scratch development project are all going to complicate that task.
Maybe we should consider another way forward:

Why not declare victory and make our prototype our product?

Perhaps the best way to speed Lucy along is to submit an Incubator proposal
which includes a collective software grant for KinoSearch, which we would
subsequently rebadge as Lucy.  In its recent review of Lucy's status, the
Lucene PMC overwhelmingly identified community building as the most important
task for us to focus on.  The fastest way for us to get to the release that we
need in order to attract new contributors and grow our community is to fill
the remaining voids in the source tree via IP clearance work rather than
software development.

Furthermore, I have come to believe that some time in the Incubator would be
valuable for us.  In retrospect, there are a number of things that Lucy missed
out on because it did not go through the Incubator.  Doug briefed us on the
Incubator's IP clearinghouse role and I had browsed the incubator.apache.org
website when I became a committer, but I have only come to *fully* appreciate
its role in guiding young communities recently, while exploring in earnest
what it will take for Lucy to govern itself responsibly as a top level
project.  

The original Lucy proposal was drawn up from scratch rather than from an
Incubator template, so we did not engage in the exercise of self-critique and
goal-setting which the current process encourages.  I think that crafting a
proposal would in and of itself be beneficial.  I also think that it would be
helpful to be held accountable via the Incubator's regular reporting schedule,
for the same reasons that I welcomed the Lucene PMC's recent request for
regular reports from Lucy.

A few weeks ago I subscribed to the general@incubator list, which has made for
interesting reading.  One ongoing discussion that is relevant to Lucy is the
proposal under consideration for the Chukwa project, which is being spun off
from Hadoop.  The developers of Chukwa have chosen to seek guidance from the
Incubator rather than petition for immediate graduation to TLP status.  Their
rationale was that none of the committers involved had served on a PMC before.
I think we would be well-advised to follow the Chukwa dev team's example.

There are five individuals whose participation in a collective software grant
for KinoSearch would be either essential or very, very helpful: myself, Peter,
Nate, Father Chrysostomos, and Chris Nandor a.k.a. Pudge.  I assume that Peter
and Nate are amenable in principle.  I have contacted Father Chrysostomos and
Pudge, and they are, too; Pudge still works for Slashdot and has received a
preliminary OK from management.  There are numerous others and it would be good
to get as many as possible on board, but those are the five who have made
multiple significant contributions and whose work would be most difficult to
disentangle.

My guess is that the chances of getting an Incubator proposal passed are
pretty decent.  However, we have weaknesses, particularly regarding the
current size of our dev community, and we should consider what course of
action we might take in the event that it does not pass.  

Regardless of what happens, I think that Lucy's interests are best served by
assimilating the KinoSearch code base and pursuing community development
aggressively.  If we can't go through the Incubator, that would mean leaving
Apache, at least for the time being.  However, I strongly believe that Lucy
will be healthiest if it is governed by a diverse group of stakeholders rather
than a BDFL, that the controlled competition of meritocratic community
development breeds excellence and that Lucy's architecture encourages it, and
that the project is best served in the long run being bound by Apache
institutions -- and thus I believe that if we do leave, we should seek to
return once we have grown.

If we leave, we would probably want to release Lucy under KinoSearch's current
GPL/Artistic license rather than go through the legal headaches of attempting
to transition to the Apache 2.0 license without the benefit of Apache's
institutions to make the process easier.  Given the quality of the product, I
expect that we will see at least as much activity around Lucy as we did around
KinoSearch circa 2007-2008 and probably more, especially once we fork off a
stable Lucy1 release and give people the backwards compatibility guarantees
that KinoSearch has never given them.  Ironically, the more success we have in
attracting contributions, the more complicated the eventual software grant
becomes.  But in a way, that would be a nice problem to have.  :)

Marvin Humphrey

