lucene-dev mailing list archives

From David Spencer
Subject Re: more sandbox questions
Date Mon, 07 Feb 2005 19:11:32 GMT
Erik Hatcher wrote:

> On Feb 7, 2005, at 1:21 AM, David Spencer wrote:
>> Erik Hatcher wrote:
>>> XML-Indexing-Demo - I propose this be moved to an "examples" area if 
>>> we keep it at all.
>>> parsers - Is anyone using the PDF parser here?
>>> taglib - my bad in committing this in the first place - it's not well 
>>> implemented and of marginal use.  I propose to remove it entirely.
>>> miscellaneous - I propose that this be moved to contrib/util.
>>> similarity & spellchecker - I propose these be combined into 
>>> contrib/util.
>>> Thoughts on these?
>> Another way of looking at it is to group query expansion code together 
>> i.e. similarity + spellchecker + wordnet go together. I think calling 
>> things "util" or "misc" demeans them - but disclaimer, these 3 things 
>> are coincidentally all mine.
> No offense or demeaning intended.

None taken! Sorry, I should have made that clear.
I agree w/ trying to make sense of the packaging as that gives Lucene 
more value.

>  I wasn't that happy with an umbrella 
> "util" area myself, but also am trying to ensure we have a clean and 
> sensible contrib area.  Keep in mind that the idea is to package each 
> contrib project as its own separate package within the Lucene 
> distribution.  So highlighter, with the Lucene 2.0 release, would be 
> packaged as highlighter-2.0.jar.  The WordNet package is unique in that 
> it is not something you add on to an application using Lucene, but 
> rather a tool that is used to generate an index for use with your 

This may not be quite precise - the WordNet pkg does 2 things, [1] 
builds a synonym index and [2] expands queries. [2] is done in

Thus I thought it would make sense to think of a "query expansion" 
module and group this + the similarity stuff...

> application.  I'm not sure how these distinctions factor into how we 
> package things.
>>> The contrib area should be useful add-ons to Lucene's core, and isn't 
>>> really appropriate for examples/demos, it seems to me.
>>> The tricky pieces are miscellaneous, similarity, and spellchecker.  
>>> These are tiny by themselves and putting them in a util area and 
>>> packaging them all together seems ok to me at one level, but does it 
>>> make more sense to keep these completely separate?
>> OK, to be more concrete, I'll suggest the 3 above go to "search" or 
>> "query-expansion".
> "search" is too generic, it seems, since all of Lucene could fit under 
> that categorization.  Maybe it makes the most sense to leave them as-is 
> for the time being - though keeping it open for discussion is good to 
> see what others think.
>     Erik