lucene-general mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: [DISCUSS] Archive Lucy
Date Sat, 07 Mar 2009 12:14:28 GMT

On Mar 6, 2009, at 4:58 PM, Marvin Humphrey wrote:

> Grant,
>
> I am currently employed by Eventful, Inc, in San Diego, CA.  They are paying
> me to work full-time on KinoSearch and Lucy.
>
> I went out of my way when we negotiated the terms of my employment to ensure
> that there was no way my contract could hamper or compromise progress towards
> Lucy.  The actual document is confidential of course, but I feel comfortable
> saying that first, our lawyers hammered out the legal nuts and bolts to my
> satisfaction, and second, Eventful is fully on board with regards to Lucy.  By
> way of illustration, my boss regularly hassles me about publishing a Lucy C
> API, even though since Eventful uses the Perl bindings the benefits would be
> indirect.
>

This just further underscores my point.  Lucy cannot be just about you  
(and your employer) contributing code that you develop in-house at  
Eventful.  A project must be able to survive any single committer  
leaving the project and, simply put, Lucy does not meet that criterion.
In the early stages, yes, often one committer gets things going, but  
Lucy's been around for a fairly long time on life support and you only  
seem to pop up on the list when nudged by the PMC.


> In my opinion, it is not in the best interests of the Apache Lucene project to
> make it more difficult for my employer and myself to contribute.

I agree, but unfortunately, it is Lucy that has languished for a good  
long time.

>
>
>> It is fairly apparent to me that the Lucy project is not making any
>> progress community-wise or code-wise.  Neither Marvin, Dave, nor Doug
>> are active at all on it, and that accounts for all three committers.
>> There has been very little mailing list traffic,
>
> You may have noticed that up until about three weeks ago (when I dove back
> into the code cave), I was quite active on java-dev@lucene.apache.org and in
> the Lucene JIRA forums.  Significant design innovations were realized,
> particularly in the area of real-time search.
>
> In the past, many designs have been hashed out cooperatively on the KinoSearch
> and Lucy mailing lists: the Schema class, revisions to QueryParser and the
> boolean Query hierarchy, the implementation of human-readable index metadata,
> C configuration probing, the OO model, index designs which exploit memory
> mapping, and so on.
>
> In this particular case, however, I was assigned the task of solving real-time
> search, for which the Lucy and KinoSearch forums were not ideal.  There is a
> very limited number of people who have both the familiarity with the
> Lucene/Lucy segment-based inverted index model and the interest to discuss
> real-time search at the level I desired, where concepts like "segment-centric
> search" could be bandied about.  Basically, I needed Mike McCandless -- so I
> went to where he could be found.
>
> The conversations that we had in JIRA and on java-dev were beneficial to both
> Lucene and Lucy; should I have posted to the Lucy dev list instead simply to
> demonstrate activity, which would have been less useful to Mike, to me, to
> Lucy, and to Lucene?  To my mind, the Lucene community is also part of the
> Lucy community.  Mike's insights were welcome and useful, and it didn't seem
> important to me which specific mailing list they wound up on -- they're all
> under the domain lucene.apache.org, after all.  Weren't we all moving forward
> together, and wouldn't that be apparent to members of the PMC such as
> yourself?
>
> Or is this a zero-sum game where design innovations which help Lucy don't
> count as "progress" if they also help Lucene?

That's all fine, but none of it adds up to people looking at Lucy and  
saying "Gee, I want to contribute to Lucy."


>
>
>> Furthermore, I have my doubts about the development process being employed,
>> which seems to be the notion that KinoSearch is going to be donated by
>> Marvin at some point in the future [1], which would only work if it were to
>> go through the Software Grant or Incubation process (which I would be happy
>> to support.), or at least that is how I understand the process to be when
>> code is developed outside of the ASF.
>
> I understand why you might have thought that, but that's not how things will
> play out, and it's a misreading of the post that you cite.
> (<http://www.lucidimagination.com/search/document/152a1a9d00b7d08a/is_there_anybody_here>)
>
> As you note, simply importing KinoSearch wholesale into the Lucy repository
> with cosmetic changes would violate the terms of the project.  But even if
> that were possible, it would represent a *horrendous missed opportunity*.
>
> A KinoSearch 1.0 release, with permanent API and file format backwards
> compatibility guarantees -- i.e. "there will never be a KinoSearch 2.0" --
> will be very beneficial for Lucy's development.  Imposing such discipline
> allows library users to proceed with maximum confidence.  For instance, it
> allows Peter Karman, who has long planned to build a KS backend for Swish, to
> move forward without having to worry about the upstream library pulling the
> rug out from underneath his users.
>
> Going that route will maximize our ability to learn the limitations and
> weaknesses of the design.  Using the knowledge we gain, we can then forge
> ahead as we have in the past: chunk by chunk, class by class.  And even though
> I am very pleased with how pluggable index components, C API user interface
> improvements, "OS-as-JVM" file format changes, and so on are coming along, I
> anticipate lots of healthy debate and major discrepancies between what ends up
> in KS 1.0 and what ends up in Lucy.
>
>> Even if KS were the plan, in looking at KS, it seems there is not much
>> community activity there, either.
>
> This is largely due to the fact that it has been a long time since I released
> any significant public updates.  I choose to release significant updates
> infrequently because breaking backwards compatibility has severe consequences
> for CPAN modules: as soon as the install completes, live apps start crashing.
>
> Since there is no sane deprecation mechanism for dynamically loaded Perl
> modules, minimizing backwards compatibility problems is a responsibility I
> take seriously.
>
>> On the flip side, one might ask what's the harm in letting it stand as
>> is?  Admittedly, not much, other than I think it confuses people b/c
>> they think there is a C port of Lucene and then they go and find it is
>> dead.
>
> Indeed.  It's not like Lucy in its present form causes harm to the bottom line
> of Lucid Imagination, Inc. ;)

What's that got to do with anything?  Give me a break.  I'm not  
attacking you.  I'm just stating that Lucy has not had any code or any  
community built for over three years.

>
>
>> Therefore, it is with some hesitation that I suggest we mothball
>> Lucy.  Mostly, I hesitate, because I hate to see any project be
>> archived on the hope that someone will come in and pick it up.
>>
>> However, I just don't see that happening.  If Marvin wishes to
>> resurrect it, he can donate KS (or whatever core part of it is Lucy)
>> and go through incubation and prove there is a community and then we
>> can turn it back on.
>
> Please give me two to three months to make the next dev release of KinoSearch.
> FWIW, if I can't get a release out within that time frame, I'm going to have
> to answer to Eventful. :)
>
> This release will introduce real-time search, improved subclassing support, an
> mmap-friendly index file format, and pluggable indexing components.  I suspect
> aspects of it may be of interest to the Java Lucene dev community -- but if
> that's the case, I won't hold it against you. ;)
>

Again, this is all great, but it just further demonstrates that you  
are doing this on your own and not as a part of the Lucy community (or  
really, even the Lucene community).  It's not a judgment of you or of  
KS.  I really like what you are doing.  It's merely a statement that  
this is not how Apache works.   There are plenty of other places to  
host code that do not have these requirements.


