lucene-general mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: [VOTE] merge lucene/solr development
Date Thu, 04 Mar 2010 10:27:50 GMT
On Thu, Mar 4, 2010 at 2:19 AM, Uwe Schindler <uwe@thetaphi.de> wrote:
> Hi,
>
> -1 on the current VOTE, as I think the same as Michael Busch and Bill Au:
>
> - I am fine with merging development mailing lists (not user mailing
>   lists).

OK.

> - But I do not want to enforce releases to appear at the same time,
>   so there must be some coordination with the fact that "Solr
>   depends on Lucene but NOT Lucene depends on Solr".

Maybe we could release Lucene but not necessarily Solr?  EG say with
segment-based search, we cutover, and Solr tests all pass (because the
change was technically back-compat), but Solr has poor performance
because it still sometimes searches at the MultiReader level, yet,
Lucene is stable.  So we release Lucene, but only release Solr once it has
more fully cutover?  Would this be possible?

Likewise in the analyzers example -- why not release contrib/analyzers
separately as important bug fixes happen?

I'm not saying we'd do this across the board for all modules, but it's
a possibility.

Also, remember that as a combined dev community, we all want what's
best for users (faster releases), so we all can brainstorm together
about how to make that possible.  I've not heard anybody wanting
slower releases.  This is why I see release frequency as really a
matter of discipline.  If we take it seriously, we should appoint
a release czar on each cycle.

> - A modularization is needed: Lucene-Core (with no analyzers at all,
>   only abstract classes), Lucene-Analysis, Lucene-Facetting,
>   Lucene-FunctionQueries, Lucene-Foobar, Solr-Core, Solr-Foobar,...

I'm a huge fan of doing this as well, but, I see merging Solr/Lucene
dev as a huge step towards making this modularization possible.

Maybe we can do this up front, when we merge?  EG moving concrete
queries, analyzers out of core (keeping abstract base classes), moving
queryParser out, seem like obvious first steps.  Hey then Lucene's JAR
will be less than 1.0 MB again!

> - No requirement for Lucene Committers to work on Solr Tests or that
>   Solr tests must pass when Lucene Changes. I would like to have it
>   more in a way that the issue tracker would do that like it is now:
>   Lucene is enhanced, BW layer still alive (so solr tests should
>   work), so open issue against solr referring to lucene issue to fix
>   solr and remove usage of deprecated methods or fix other problems.

First off, not breaking tests seems like a simple win?  Ie, if we're
not breaking back compat, the tests all pass.  If we accidentally
break back compat, the tests fail, and protect us, which is only good?

I think your real concern might be that you (and other "I want to
focus only on core Lucene" devs) might be forced to spend a lot of time
working on Solr when new changes happen in Lucene?

I would argue that this won't happen.  Or, it will only happen if you
want to work on Solr.  Think about the segment based search example --
Lucene can cutover to segment based search, pass all tests.  But then
a separate issue would be opened and likely a separate person (say
Yonik) would work on it, to have Solr properly take advantage of /
expose the new feature.

Now some features will be driven by someone wearing a Solr hat.  EG
Chris/Grant's (& others') push to get spatial working well with Solr.
If this push were done with shared development then it would be
created/iterated in both a Solr and Lucene friendly way, with a single
source.

And take our work in flex -- if we're doing our jobs, landing flex will
already pass all Solr tests.  I *really* want to have Solr tests confirm
that we didn't break back-compat.  But I can't, now (Solr's not on trunk).

Does that mean I must go and fix Solr to make it possible to specify a
custom codec through schema.xml?  No.  Exposing flex in Solr, NRT in
Solr, is always going to be a separate step.

The combined dev community would have no requirement/expectation that
if someone adds something cool to Lucene they must also expose it in
Solr.  There will still be devs that wear mostly Solr vs mostly Lucene
hats.  There will also be devs that comfortably wear both.  There will
be devs that focus on analyzers and do amazing things ;)

> - And last but not least the whole merge should be done *after* the
>   current code bases are again closer to each other, especially Flex
>   is in and Solr is at least on Lucene 3.0.1.

Well, this really is a logistical question (ie not really a reason for
casting a -1 vote).

But, yes, we have to get Solr upgraded to Lucene trunk before we can
merge development -- there's no way around that.

On requiring flex to land first... I'm now agnostic.  I suspect landing
flex won't be that much harder before vs after.  So if we can't finish
up flex in time (I have ~88 nocommits left ;), I don't think that
should block merging Solr/Lucene dev.

Mike
