lucene-dev mailing list archives

From Erik Hatcher
Subject Re: more sandbox questions
Date Mon, 07 Feb 2005 14:03:18 GMT
On Feb 7, 2005, at 1:21 AM, David Spencer wrote:
> Erik Hatcher wrote:
>> XML-Indexing-Demo - I propose this be moved to an "examples" area if 
>> we keep it at all.
>> parsers - Is anyone using the PDF parser here?
>> taglib - my bad in committing this in the first place - it's not well 
>> implemented and of marginal use.  I propose to remove it entirely.
>> miscellaneous - I propose that this be moved to contrib/util.
>> similarity & spellchecker - I propose this be combined with the 
>> contrib/util.
>> Thoughts on these?
> Another way of looking at it is to group query expansion code together 
> i.e. similarity + spellchecker + wordnet go together. I think calling 
> things "util" or "misc" demeans them - but disclaimer, these 3 things 
> are coincidentally all mine.

No offense or demeaning intended.  I wasn't that happy with an umbrella 
"util" area myself, but also am trying to ensure we have a clean and 
sensible contrib area.  Keep in mind that the idea is to package each 
contrib project as its own separate package within the Lucene 
distribution.  So highlighter, with the Lucene 2.0 release, would be 
packaged as highlighter-2.0.jar.  The WordNet package is unique in that 
it is not something you add on to an application using Lucene, but 
rather a tool that is used to generate an index for use with your 
application.  I'm not sure how these distinctions factor into how we 
package things.

>> The contrib area should be useful add-ons to Lucene's core, and isn't 
>> really appropriate for examples/demos, it seems to me.
>> The tricky pieces are miscellaneous, similarity, and spellchecker.  
>> These are tiny by themselves and putting them in a util area and 
>> packaging them all together seems ok to me at one level, but does it 
>> make more sense to keep these completely separate?
> OK, to be more concrete, I'll suggest the 3 above go to "search" or 
> "query-expansion".

"search" is too generic, it seems, since all of Lucene could fit under 
that categorization.  Maybe it makes the most sense to leave them as-is 
for the time being - though keeping it open for discussion is good to 
see what others think.

