lucene-general mailing list archives

From Michael Busch <busch...@gmail.com>
Subject Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?
Date Mon, 01 Mar 2010 05:26:39 GMT
I forgot to mention: I admittedly haven't been very involved in Solr in 
the past. So I'm probably not aware of many of the problems Solr might 
have had with staying in sync with Lucene. If everyone here agrees with 
Yonik/Mike's proposal I will not try to block it with a -1 veto. I'm 
just trying to express here the concerns that come to my mind. To do 
what's best for the future of the Lucene TLP as a whole is of course my 
main interest too.

And I really really want to still be able to use Lucene separately as a 
library, and I think we all agree here!

  Michael


On 2/28/10 9:05 PM, Michael Busch wrote:
> On 2/28/10 4:30 PM, Grant Ingersoll wrote:
>> Not sure why more tests would be a negative.  The Solr tests exercise 
>> quite a bit of Lucene functionality as well.
>>
>> -Grant
>
> Sorry, I should have made myself clearer here. It'd obviously be silly 
> to argue against more test coverage. In general I think it's a great 
> idea to run the Solr tests also when testing a Lucene patch.
>
> I'm just not happy about making this a formal requirement (that Solr 
> tests have to pass in order to commit a Lucene patch). All 
> backwards-incompatible patches, which we had quite a few of in 2.9 and 
> 3.0, would then become even more difficult to commit, because you would
> also have to make the matching changes in Solr as part of the Lucene patch.
> Think about changes like per-segment search or the new TokenStream API
> and how difficult and time-consuming they already were for core and
> contrib. For backwards-compatible changes, by all means, let's run as
> many tests as we can.
>
> We have all been saying we want to have more frequent releases. Right 
> now Lucene has no external dependencies that could slow down a release 
> and still we don't release as frequently as we'd like to. If we add 
> dependencies like release alignment with subprojects I'm afraid this 
> will become worse.
>
> I was really happy about the original idea of having a separate 
> analyzer module (or subproject, library, whatever name it'd have), 
> because analysis seems quite separate from indexing/search. Separating 
> the two seems logical. And why not release such an analyzer package
> more frequently than Lucene? Different pieces of code don't all move
> at the same pace. It'd be nice to have the freedom of releasing an
> analyzer library after e.g. a new language was added, maybe even only
> two weeks after the previous release. IMO more modular release cycles
> are a better way to go than this new proposal.
>
> I'd be happy if the Solr developers would be more involved in Lucene 
> (again) and if we would discuss new ideas with the question in mind, 
> where the new feature should live. And also the Lucene developers who 
> are not very involved in Solr should understand the impact that Lucene 
> changes have on Solr. So big +1 for better communication between Solr 
> and Lucene devs!
>
>  Michael

