lucene-dev mailing list archives

From Grant Ingersoll <>
Subject Re: Payloads and TrieRangeQuery
Date Fri, 12 Jun 2009 21:33:53 GMT

On Jun 12, 2009, at 12:20 PM, Michael McCandless wrote:

> On Thu, Jun 11, 2009 at 4:58 PM, Yonik Seeley wrote:
>> In Solr land we can quickly hack something together, spend some time
>> thinking about the external HTTP interface, and immediately make it
>> available to users (those using nightlies at least).  It would be a
>> huge burden to say to Solr that anything of interest to the Lucene
>> community should be pulled out into a module that Solr should then
>> use.
> Sure, new and exciting things should still stay private to Solr...
>> As a separate project, Solr is (and should be) free to follow
>> what's in its own best interest.
> Of course!
> I see your point, that moving things down into Lucene is added cost:
> we have to get consensus that it's a good thing to move (but should
> not be hard for many things), do all the mechanics to "transplant" the
> code, take Lucene's "different" requirements into account (that the
> consumability & stability of the Java API is important), etc.

The problem traditionally has been that people only do the work one  
way.   That is, they take it from Solr, but then they never submit  
patches to Solr to use the version in Lucene.  And, since many of the  
Lucene committers are not Solr committers, even if they do the Solr  
work, they can't see it through.

It seems all the pure Lucene devs want the functionality of Solr, but  
they don't want to do any of the work to remove the duplication from  
Solr.  Additionally, it is often the case that by the time it gets  
into Lucene, some Solr user has come along and improved the Solr  
version.  The Function stuff is example numero uno.

Wearing my PMC hat, I'd say if people are going to be moving stuff
around like this, then they had better keep Solr up to date, too,
because otherwise it creates a lot of work for Solr, to its
detriment (that time could be spent doing other
things).  Still, I don't think that is all that worthwhile, as it will
just create a ton of extra work.  People who want Solr stuff are free  
to pull what they need into their project.  There is absolutely  
nothing stopping them.

And the fact is that no matter how much is pulled out of Solr, people
will still contribute things to Solr because it is its own community
and is fairly autonomous, a few committers that cross over
notwithstanding.  I'd venture a fair number of Solr committers know
little about Lucene internals.  Heck, given the amount of work you do,  
Mike, I'd say a fair number of Lucene committers know very little  
about the internals of Lucene anymore.  It has been good to see you
over in Solr land watching what is going on there, at least to
help coordinate when Solr finds Lucene errors.

> But, there is a huge benefit to having it in Lucene: you get a wider
> community involved to help further improve it, you make Lucene
> stronger which improves its & Solr's adoption, etc.

That is not always the case.  Pushing things into Lucene from Solr
makes it harder for Solr committers to do their work, unless you are
proposing that all Solr committers should be Lucene committers.

As for adoption, most people probably should just be starting with  
Solr anyway.  The fact is that every Lucene committer to a tee will
tell you that they have built something that more or less looks like  
Solr.  Lucene is great as a low-level Vector Space implementation with  
some nice contribs, but much of the interesting stuff in search these  
days happens at the layer up (and arguably even a layer above that in  
terms of UI and intelligent search, etc).  In Lucene PMC land, that  
area is Solr and Nutch.  My personal opinion is that Lucene should  
focus on being a really fast, core search library and that the outlet  
for the higher level stuff is in Solr and Nutch.  It is usually  
obvious when things belong in the core, because people bring them up  
in the appropriate place (there are some rare exceptions, which you
have mentioned).

