lucene-dev mailing list archives

From Michael Busch
Subject Re: Modularization
Date Sat, 21 Mar 2009 11:34:30 GMT
On 3/21/09 11:26 AM, Michael McCandless wrote:
> I think we are mixing up source code modularity with
> bundling/packaging.
> Honestly, I would not mind much where the source code lives in svn, so
> long as a developer, upon downloading Lucene 2.9, can go to *one*
> place (javadocs) for Lucene's "queries & filters" and see
> {Int,Long}NumberRangeFilter in there.
> We are not there today: a developer must first realize there's a whole
> separate place to look for "other" queries (contrib/queries).  Then
> the developer browses that and likely becomes confused/misled by what
> TrieRangeQuery means (is it a letter trie?).
> My goal here is Lucene's consumability -- when someone new says "hey I
> heard about this great search library called Lucene; let me go try it
> out" I want that first impression to be as solid as possible.  I think
> this is very important for growing Lucene's community.  This is why
> "out of the box" defaults are so crucial (eg changing IW from flushing
> every 10 docs to every 16 MB gained sizable throughput).
So this guy landing on 
sees the "Overview" section first. That one gives only a very short
introduction to what Lucene is. He might then look at "Features", which
is also not very specific. I think the next step would be to look
for the documentation of the newest release, so he would click on
"Lucene 2.4.1 Documentation". The landing page doesn't say much, except
that it tells you to look for the javadocs and other docs in the menu. So
maybe the "Getting Started" link would be the first one to go to, but it's
also pretty far down the list. So he would probably click on the
javadocs first. Now he encounters "All, Core, Demo, Contrib". Until now,
he hasn't read the word "Contrib" anywhere. As far as I can tell, we have
no documentation anywhere that introduces the concept of contribs or says
where to find them. Even the "Contributions" section talks about something
else. So that guy probably then looks through the demo and examples and
ends up using only core features until he becomes more familiar with
Lucene as a whole. Maybe he actually ends up buying LIA(2) :)

> How many times have we seen a review, article, blog post, etc.,
> comparing Lucene to other search libraries only to incorrectly
> complain because "Lucene can't do XYZ" or "Lucene's indexing
> performance is poor", etc, because they didn't dig in to learn all the
> tunings/options/tricks we all know you are supposed to do?  (It
> frustrates me to no end when this happens).  This then hurts Lucene's
> adoption because others read such articles and conclude Lucene is a
> non-starter.
> We all ought to be concerned with Lucene's adoption & growth with time
> (I am), and first-impression consumability / out of the box defaults
> are big drivers of that.
> point?) we change how Lucene is bundled, such that core queries and
> contrib/query/* are in one JAR (lucene-query-3.0.jar)?  And
> lucene-analyzers-3.0.jar would include contrib/analyzers/* and
> org/apache/lucene/analysis/*.  And lucene-queryparser.jar, etc.

So yeah, I like this, and 3.0 is a good opportunity to do it. I think a
big part of this work should be good documentation. As you mentioned,
Mike, it should be very simple to get an overview of what the different
modules are. So there should be a list of the different modules,
together with a short description of each and information about where
to find it (which jar). Then, by clicking on e.g. queries, the user
would see the list of all queries we support.

But I think we should still have "main modules", such as core, queries,
analyzers, ..., and separately e.g. "sandbox modules?", for the things
currently in contrib that are experimental or, as Mark called them,
"graveyard contribs" :) ... even though we might then just as well ask
whether we can't actually bury the latter ones...

> Mike
> Michael Busch wrote:
>> On 3/21/09 12:27 AM, Michael Busch wrote:
>>> +1. I'd love to see Lucene going into such a direction.
>>> However, I'm a little worried about contrib's reputation. I think it 
>>> contains components with differing levels of activity, maturity and 
>>> support.
>>> So maybe instead of moving things from core into contrib to achieve 
>>> the goal you mentioned, we could create a new folder named e.g. 
>>> 'components', which will contain stuff that we claim is as stable, 
>>> mature and supported as the core, just packaged into separate jars. 
>>> Those jars should then only have dependencies on the core, but not 
>>> on each other. They would also follow the same 
>>> backwards-compatibility and other requirements as the core. Thoughts?
>> I guess something very similar has been proposed and discussed here: 

>> (same link that Hoss sent while having his deja vu)...
>> -Michael
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
