lucene-dev mailing list archives

From "Jason Rutherglen" <>
Subject Re: [jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes
Date Thu, 04 Dec 2008 21:24:45 GMT
Correction: Powerset apparently did not use Lucene.  And apparently there
are a few other companies that, while not open sourcing their work, use
Lucene serialization regularly.

> Did you pay Michael?  No one here is compelled to work with anyone else.
> We work with others when we feel it is in our mutual self interest.

Nice... I guess our government is the macrocosm.

On Thu, Dec 4, 2008 at 11:21 AM, Jason Rutherglen <> wrote:

> To put things in perspective, I believe Microsoft (who could potentially
> place a lot of resources towards Lucene) now uses Lucene through Powerset?
> and I don't think those folks are contributing back.  I know of several
> other companies who do the same, and many potential contributions that are
> not submitted because people and their companies do not see the benefit of
> going through the hoops required to get patches committed.  A relatively
> simple patch such as 1473 Serialization represents this well.
> For example if a company is developing custom search algorithms, Lucene
> supports TF/IDF but not much else.  Custom search algorithms require
> rewriting lots of Lucene code.  Companies who write new search algorithms do
> not necessarily want to rewrite Lucene as well to make it pluggable for new
> scoring as it is out of scope, they will simply branch the code.  It does
> not help that the core APIs underneath IndexReader are protected and package
> protected, which assumes a user who is not advanced.  It is repeated on the
> mailing lists that new features will threaten the existing user base, a
> claim based on opinion rather than fact.  More advanced users are currently
> hindered by the conservatism of the project and so naturally have stopped
> trying to submit changes that alter the core non-public code.
> The rancor is from users who would benefit from a faster pace and the ability
> to be more creative inside the core Lucene system.  As the internals change
> frequently and unannounced, the process of developing core patches is
> difficult and frustrating.
> Now that Lucene is stable and flexible indexing is being implemented, it
> would benefit the community to focus on the future.  Who exactly is
> responsible for this?  Which of the committers are building for the future?
> Which are doing bug fixes?  What is the process of developing more advanced
> features in open source?  Right now it seems to be one person, Michael
> McCandless developing all of the new core code.  This is great forward
> progress, however it's unclear how others can get involved and not get
> stampeded by the constant changes that all happen via one brilliant person.
> I have requested of people such as Michael Busch to collaborate on the
> column stride fields and received no response.
> To me, a good example of volunteers is people who prepare food and donate
> their time at soup kitchens with no pay, and no hope for pay related to
> feeding the hungry.
> -J
> On Wed, Dec 3, 2008 at 2:52 PM, Grant Ingersoll <> wrote:
>> On Dec 3, 2008, at 2:27 PM, Jason Rutherglen (JIRA) wrote:
>>> Hoss wrote: "sort of mythical "Lucene powerhouse"
>>> Lucene seems to run itself quite differently than other open source Java
>>> projects.  Perhaps it would be good to spell out the reasons for the
>>> reluctance to move ahead with features that developers work on, that work,
>>> but do not go in.  The developer contributions seem to be quite low right
>>> now, especially compared to neighbor projects such as Hadoop.  Is this
>>> because fewer people are using Lucene?  Or is it due to the reluctance to
>>> work with the developer community?  Unfortunately, the perception among
>>> some people who work on search-related projects is that it is the latter.
>> Or, could it be that Hadoop is relatively new and in vogue at the moment,
>> very malleable and buggy(?) and has a HUGE corporate sponsor who dedicates
>> lots of resources to it on a full time basis, whilst Lucene has been around
>> in the ASF for 7+ years (and 12+ years total) and has a really large install
>> base and thus must move more deliberately and basically has 1 person who
>> gets to work on it full time while the rest of us pretty much volunteer?
>> That's not an excuse, it's just the way it is.  I, personally, would love to
>> work on Lucene all day every day as I have a lot of things I'd love to
>> engage the community on, but the fact is I'm not paid to do that, so I give
>> what I can when I can.  I know most of the other committers are that way
>> too.
>> Thus, I don't think any one of us has a reluctance to move ahead with
>> features or bug fixes.   Looking at CHANGES.txt, I see a lot of
>> contributors.  Looking at java-dev and JIRA, I see lots of engagement with
>> the community.  Is it near the historical high for traffic?  No, it's not, but
>> that isn't necessarily a bad thing.  I think it's a sign that Lucene is
>> pretty stable.
>> What we do have a reluctance for are patches that don't have tests (i.e.
>> this one), patches that massively change Lucene APIs in non-trivial ways or
>> break back compatibility or are not kept up to date.  Are we perfect?  Of
>> course not.  I, personally, would love for there to be a way that helps us
>> process a larger volume of patches (note, I didn't say commit a larger
>> volume).  Hadoop's automated patch tester would be a huge start in that, but
>> at the end of the day, Lucene still works the way all ASF projects do: via
>> meritocracy and volunteerism.     You want stuff committed, keep it up to
>> date, make it manageable to review, document it, respond to
>> questions/concerns with answers as best you can.  To that end, a real simple
>> question can go a long way toward getting something committed, and it simply
>> is:  "Hey Luceners, what else can I do to help you review and commit
>> LUCENE-XXXX?"  Lather, rinse, repeat.   Next thing you know, you'll be on
>> the receiving end as a committer.
>> -Grant
