lucene-general mailing list archives

From Michael Busch
Subject Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?
Date Mon, 01 Mar 2010 05:05:51 GMT
On 2/28/10 4:30 PM, Grant Ingersoll wrote:
> Not sure why more tests would be a negative.  The Solr tests exercise quite a bit of
> Lucene functionality as well.
> -Grant

Sorry, I should have made myself clearer here. It'd obviously be silly 
to argue against more test coverage. In general I think it's a great 
idea to run the Solr tests also when testing a Lucene patch.

I'm just not happy about making this a formal requirement (that Solr 
tests have to pass in order to commit a Lucene patch). All 
backwards-incompatible patches, which we had quite a few of in 2.9 and 
3.0, would then become even more difficult to commit, because you have 
to make all the changes in Solr too as part of the Lucene patch. Think 
about changes like per-segment search or the new TokenStream API and how 
difficult and time consuming they were for core and contrib already. For 
backwards-compatible changes, by all means, let's run as many tests as 
we can.

We have all been saying we want to have more frequent releases. Right 
now Lucene has no external dependencies that could slow down a release 
and still we don't release as frequently as we'd like to. If we add 
dependencies like release alignment with subprojects I'm afraid this 
will become worse.

I was really happy about the original idea of having a separate analyzer 
module (or subproject, library, whatever name it'd have), because 
analysis seems quite separate from indexing/search. Separating the two 
seems logical. And why not release such an analyzer package more 
frequently than Lucene? Different pieces of code don't all move at the 
same pace. It'd be nice to have the freedom of releasing an analyzer 
library after e.g. a new language was added, maybe even only two weeks 
after the previous release. IMO more modular release cycles are a better 
way to go than this new proposal.

I'd be happy if the Solr developers were more involved in Lucene 
(again) and if we discussed new ideas while keeping in mind the question 
of where each new feature should live. Also, the Lucene developers who 
are not very involved in Solr should understand the impact that Lucene 
changes have on Solr. So big +1 for better communication between Solr 
and Lucene devs!

