lucene-general mailing list archives

From Mark Miller <>
Subject Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?
Date Mon, 01 Mar 2010 13:25:52 GMT

> On 2/28/10 9:05 PM, Michael Busch wrote:
>>  Think about changes like per-segment search or the new TokenStream 
>> API and how difficult and time consuming they were for core and 
>> contrib already.
1. It's not just more work for the same Lucene devs - with a merge, there 
would be more devs to work on these things. Devs who mostly stay in Solr 
land would probably have been involved in these changes earlier, in 
Lucene land, had the projects been merged.

2. Solr found a bunch of issues with the TokenStream API. Might not be 
such a bad idea for such large changes to have to go through that. Solr 
also exposed issues that other users were going to have to face with per 
segment - might be good to be forced to face that as well.

3. It's already been mentioned that you wouldn't have to do the Solr 
part to add the Lucene part. You'd likely have been able to do the same 
thing - create the new API, get the backwards compat pieces in, and then 
create a JIRA issue to get it done for Solr. Then later, Robert and 
Yonik would have done most of the work - similar to how things worked 
anyway. At least getting Solr tests to pass seems like a nice way to 
keep Lucene honest - you have to think about and see your changes play 
out in actual use. It doesn't mean you actually have to do all of the 
work to get Solr completely up to speed. If the TokenStream API was 
perfect, it wouldn't have broken Solr tests. Per segment is a much 
rarer situation.

> I'd be happy if the Solr developers would be more involved in Lucene 
> (again) and if we would discuss new ideas with the question in mind, 
> where the new feature should live. And also the Lucene developers who 
> are not very involved in Solr should understand the impact that Lucene 
> changes have on Solr. So big +1 for better communication between Solr 
> and Lucene devs!

Again - same way we'd all like there to be more frequent releases. I'd 
bet a fortune it's not going to happen based on what we'd "like" to see. 
I do see a solution to getting this done being proposed, though.

My main concern, still, is the complication of releasing together, and 
how that is going to affect release frequency. Other than that, I've 
only seen wins for the quality of both projects. Most of the arguments 
against assume the merge means more than it does, I think. Lucene 
will still be a library separate from Solr. People contributing to Lucene 
will not be required to do the Solr piece. This just moves us along the 
path of what Michael says he'd like to see above - and what I think most 
of us would like to see. We have learned from the past though - these 
things would likely never happen without real change being implemented.

Moving to a +1 from me.

- Mark
